Some additional interesting tech stories I would add:
- In 2010, the bank had retail Good Till Cancel orders from 1997. I think one was "Buy INTC at $6"
- There is a mix of "I didn't know technology could do this" in the good sense and "I'm amazed this code a. works at all and b. hasn't had an outage in 6 years"
- There is a strong desire, I chose this word carefully, to migrate off of legacy systems. That being said, there are several big issues: 1. it's a GIGANTIC amount of effort with often unclear ROI to the business, 2. upside is capped (maybe you get a promotion) but downside risk is huge (you could tank the business with an outage), 3. slow, gradual refactors are generally better here but some things can only be "big bang" for various reasons
- You tend to see old but performant and battle tested systems get retired in favor of shiny, new systems with lots of bugs. Why? It looks better on a resume to say "I retired old, crufty legacy system and rolled out a new system" instead of "I refactored old system to be better"
- The complexities are wild, e.g. Korean trading requires: a. traders to be licensed in Korea (even if they are working in NYC), b. servers to be in Korea, c. tagging orders with not only their executions but also the exchange rate at the time
- There are entire SYSTEMS built to track trade breaks (e.g. Bank A doesn't agree with Bank B on fill 1248383). Some of these trade breaks are open for YEARS due to litigation, companies going out of business, etc.
I could go on and on about this.
If anyone is ever interested in having me on a podcast to talk more about it, I would totally be up for it.
Some of the folk that built that (or worked on it) ended up at JPM and Merrill where they built the Python centric version - Alpha and Quartz respectively. Barclays Capital has/had a similar system as well I think, but it’s not one I know about offhand - they did though, memorably, have a system that was pretty much Haskell-in-Excel.
[0] https://www.slideshare.net/slideshow/managing-python-at-scal...
Genius coder, yes. Nice guy, most definitely yes.
The answer is often that the battle-hardened mature off-the-shelf solution did not exist at the time the code was written. You're doing software archaeology.
Every patch delay puts more pressure on you and your team to fork the codebase and go it alone. You and your team sit down and promise you’ll rebase over upstream releases and everyone nods wisely. Then you skip a release, and another, and presto: you now have Bank Redis or Bank Selenium or Bank Hadoop, trapped on the last version of upstream before the fork but to which you can patch changes as fast as you like. I’d liken this to crossing an event horizon, except the astronaut sees the universe freeze and fade away instead of the outside observer.
It’s possible to make it work if the upstream project either gives you a majority vote (or at least a substantial share of the vote) on project direction, or you’re working on a project large enough to have lots of corporate (ie funded, high velocity) stakeholders already.
I've rolled with the "we'll keep a local patch against upstream" for small changes before, which helps keep on track with upgrades, but depends how feasible that is.
The whole idea was actually to use as much existing Open Source technology as possible. Hence Python and its rich library ecosystem, instead of something home grown like SecDB/Slang. This was supplemented with proprietary infrastructure and libraries only where there was a clear need, for example a Directed Acyclic Graph library to ease migrations from the Excel sheets used by Quants. The distributed object store was pretty neat.
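To make the DAG idea concrete, here is a minimal sketch of spreadsheet-style cells evaluated in dependency order. It is not the bank's actual library: the cell names, the formulas and the `evaluate` helper are all made up for illustration, and it uses only the standard library's `graphlib` (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical spreadsheet-like cells: name -> (dependencies, formula).
# A formula receives the values of its dependencies, in order.
cells = {
    "spot": ((), lambda: 100.0),
    "fx_rate": ((), lambda: 1.25),
    "notional": ((), lambda: 10.0),
    "value_usd": (("spot", "notional"), lambda spot, notional: spot * notional),
    "value_local": (
        ("value_usd", "fx_rate"),
        lambda value_usd, fx_rate: value_usd * fx_rate,
    ),
}


def evaluate(cells):
    """Evaluate every cell exactly once, dependencies first."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(deps) for name, (deps, _) in cells.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        deps, formula = cells[name]
        results[name] = formula(*(results[d] for d in deps))
    return results


results = evaluate(cells)
```

A real library would presumably add caching and recompute only the cells downstream of a change, which is what makes the spreadsheet analogy work for quants.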
You could code up a basic web service with minimal functionality and have it running in nonprod in an afternoon, and then production the day after. All that boilerplate stuff was super low friction, so you could spend much more of your time on solving the actual problem.
> Time to drop a bit of a bombshell: the [Barbara] source code is in Barbara too, not on disk. Remain composed. It's kept in a special Barbara ring called sourcecode.
> The ZODB is an (almost) transparent python object persistence system, heavily influenced by Smalltalk.
https://zodb.org/en/latest/articles/ZODB-overview.html#compa...
I think Jim Fulton and other authors of Zope originally came from Smalltalk
- Unified interface for object stores
- Source code stored with data files
- Job runner
I also see some similarities with Lisp machines: Python also has a REPL, and the system is able to dump/restore state (but in this case discrete objects are serialized, not the entire memory image).
This might sound crazy for people used to having 90% glue code / 10% business codebases, but to me seems like a very efficient way to have users directly drive what is effectively a large computer, and more like how things used to be.
The drawback is that it seems to be a monolith, and maybe hard to reimplement on top of more modern foundations. But as a general API, it seems to make sense.
Of course, financial institutions have a lot of “secret sauce” - such as financial models - you’d never expect them to release.
But this kind of underlying infrastructure isn’t really “secret sauce”
The more they use cloud-hosted LLMs, the more likely it will get leaked into training data.
I’ve had clients ring up about pennies… it can be crazy what some people are motivated by
Quite common in accounting: the accounting equation must balance. It's like a checksum.
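A toy illustration of the checksum idea, assuming a simplified sign convention (debits positive, credits negative) and made-up account names: a journal is only valid if its entries sum to exactly zero, so even a one-penny discrepancy is detectable.

```python
from decimal import Decimal


def is_balanced(entries):
    """Return True when signed amounts (debits +, credits -) sum to zero."""
    total = sum((amount for _, amount in entries), Decimal("0"))
    return total == 0


# Hypothetical journal: one debit and one matching credit.
journal = [
    ("cash", Decimal("100.00")),
    ("revenue", Decimal("-100.00")),
]

# The same journal with a stray penny that has no offsetting entry.
off_by_a_penny = journal + [("fees", Decimal("0.01"))]
```

`Decimal` is used rather than `float` so that pennies compare exactly.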
Highly agree with this. I think it's very underappreciated in startups that if you want people to deploy a lot of small services you have to make that really super easy. I always thought that the value of things like Spark is that you can run "things" without having to worry about how they run. K8s is similar but much more complex. AWS Lambda is nice but also comes with a lot of baggage at scale. I always wanted to try something like Dapr, which seems to provide a very opinionated happy path for application development.
can_view(Person) :- didPITAOnlineTraining(Person), ...
Also... these things tend to have fuckin terrible documentation. Good luck figuring any of this out. And you can't google it and your AI is just as lost as you
- source code in database: yes
- own IDE for questionable reasons: you bet!
- custom table objects: we got your back.
- strange forks of common python libraries: would you like warnings with that?
I convinced my boss to hire an intern for the summer to do this. They said: "wouldn't internship projects that involved actual coding be more attractive?"
I replied: "Well, they'll be having to do a lot of experimenting to figure things out..."
It seems like any giant organization eventually develops its own software center of gravity.
The only real difference there (although it is a significant one) is that most of those internal Google tools tended to be very good, often ahead of the external state-of-the-art. That's a very different feeling to a baroque old stack inside a bank somewhere. Maybe the external world has caught up on a bunch of them more recently though which would start to change that.
These initiatives were independent of Minerva and Athena, which was good but not very useful to the more mundane parts of the bank that everyone off the financial trading floors cares about.
Does anyone working at one of these banks or similar know if this information still holds true?
And have any of the banks started using uv yet? Or will they forever be using pip?
SC has its own Haskell compiler that produces bytecode that you can run locally, serialize, send to be executed somewhere else, etc. Most of the code still lived in a monorepo, though.
We did have a global data store (well, several) that any code could access. I was working on a more "normal" application that was still written in the SC haskell dialect but otherwise mainstream architecture -- postgres, deploying to a boring linux server, etc.
A colleague once described our dialect as "Python that looks like Haskell". This is an exaggeration, but a) we did use a lot of untyped dicts and everything-is-a-giant-relational-table structures, and b) my understanding is that the actual financial modelling was done in C++ and the SC Haskell was glueing things together. Idk.
About uv -- I did try to convert ppl to uv but it probably didn't spread further than my few colleagues at the Warsaw office.. well and also I merged a monorepo-wide documentation system that used sphinx and uv, but idk if it's still alive after I left.
There was certainly a widespread understanding that doing things the "Bank Way" made recruitment difficult, and they hypothesised that it was also a significant drag on their ability to turn around new projects. The main goal of the new platform was to provide an alternative way of doing things which would allow them to quantify that drag.
I know that the pilot was completed, and it went on to a more widespread deployment - but my involvement with it had already ended so I can't say if it actually proved their hypothesis / provided the quantitative data they wanted.
Right out of the gates, it's crazy how this contrasts with Mercury's Haskell infra
Barbara is a company wide database, it handles data storage.
When I read about internal app state being stored in Barbara I'm interpreting that the policy is for the data to be centralized for more vertical control.
While the Temporal thing sounds like if something is written, it's done in a container-like manner, and other processes can't just read it.
It stores the app's work-in-progress state as well (probably as a blob full of serialised internal datastructures, at least in some cases):
> You write your workflow as ordinary sequential code, and the platform records every step in an event history. If a worker crashes mid-workflow, another worker replays the deterministic prefix to reconstruct the state, then continues from where it left off.
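The replay mechanism in that quote can be sketched in a few lines. This is a toy model of the idea only, not Temporal's API: `Context`, `WorkerCrash`, `reserve`, `charge` and `payment_workflow` are all hypothetical names, and the "durable" history is just a Python list.

```python
class WorkerCrash(Exception):
    """Stands in for a worker process dying mid-workflow."""


class Context:
    """Records each step result; on replay, returns recorded results instead."""

    def __init__(self, history):
        self.history = history  # stand-in for a durable event history
        self.position = 0

    def step(self, fn, *args):
        if self.position < len(self.history):
            # Already ran before the crash: replay the recorded result.
            result = self.history[self.position]
        else:
            # New territory: actually run the step and record it.
            result = fn(*args)
            self.history.append(result)
        self.position += 1
        return result


calls = []  # tracks which side effects really executed


def reserve(amount):
    calls.append("reserve")
    return f"reservation-for-{amount}"


def charge(reservation):
    calls.append("charge")
    return f"charged-{reservation}"


def payment_workflow(ctx, amount, crash_after_reserve=False):
    reservation = ctx.step(reserve, amount)
    if crash_after_reserve:
        raise WorkerCrash
    return ctx.step(charge, reservation)


history = []
try:
    payment_workflow(Context(history), 42, crash_after_reserve=True)
except WorkerCrash:
    pass

# A second "worker" picks up the same history and continues.
result = payment_workflow(Context(history), 42)
```

The second run does not call `reserve` again; it reads the recorded result and carries on to `charge`. This only works if the workflow code is deterministic, which is why the quote stresses the "deterministic prefix".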
> When I read about internal app state being stored in Barbara I'm interpreting that the policy is for the data to be centralized for more vertical control.
That wasn't the way I experienced it, if anything it was the opposite: app developers would push to use Barbara for their internal state because it was easy: the app is already accessing it, the APIs are simple, and since it's just pickled objects you can just store your state without having to worry about serialisation (much) or ORM. Whereas policy and leadership would if anything prefer you to use a separate traditional database. The point of Barbara is to provide a unified interface onto "everything the bank knows", it's primarily for data that multiple teams use, not internal state owned by a single team.
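To show why that is so tempting, here is roughly what "just store your state" looks like with pickled objects. This is an illustrative sketch, not Barbara's real interface: the `ObjectStore` class, its `write`/`read` methods and the path are invented, and an in-memory dict stands in for the actual backend.

```python
import pickle


class ObjectStore:
    """Toy path-addressed store that keeps pickled Python objects."""

    def __init__(self):
        self._blobs = {}  # path -> pickled bytes; stand-in for a real backend

    def write(self, path, obj):
        self._blobs[path] = pickle.dumps(obj)

    def read(self, path):
        return pickle.loads(self._blobs[path])


store = ObjectStore()

# Arbitrary app state: no schema, no ORM mapping, no serialisation code.
state = {"pending_orders": [("INTC", 6.0)], "last_run": "2010-01-01"}
store.write("/apps/example/state", state)

restored = store.read("/apps/example/state")
```

The convenience is also the catch: nothing forces a schema, and unpickling is only safe when every writer is trusted, which is part of why a separate traditional database looks better from the policy side.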
But also I suppose you may be saying exactly this?
There's a very big difference between the kind of bank you walk into to get a checking account, versus one that has no (individual) customers and whose job it is to assist with IPOs or whatever.
s/barbara/sandra/g
Is this really the case? I'm sure there are plenty of transactions that for umpteen different reasons must not be exposed on a global level.
But anyway, specific trades are rarely private to one part of the bank for many reasons. For example regulatory: these days you have to notify the regulator about every trade.