Multi-node / distributed setup (scaling to huge datasets / multi-tenant) #213

joepio · 2021-11-19T14:22:23Z

Atomic-server has been designed to run on low-end hardware, as a self-hosted server. This was why I decided to use an embedded database (sled) and search engine (tantivy).

However, this introduces a problem. What do you do when the physical constraints of a single machine are exceeded by the demands? For example, when a single CPU is not fast enough to host your data, or if the RAM is bottlenecking performance, or if the disk size is insufficient? This is where distributed setups come in handy, but the current architecture is not designed to deal with this.

This thread is for exploring what might need to happen to facilitate a multi-node setup.

Thoughts

I think it makes sense to send all incoming Commits to all nodes.
I think using actix+websockets, sending binary object over the network should work pretty well
We need new messages for registering new nodes. They should probably tell the network that they would like to receive some range of resources that they are responsible for.
Both the subject-property-value store and the value-property-subject stores should probably be distributed. Should nodes specialize in one or the other?
I have no idea on how to distribute tantivy, but the docs suggest that it's been designed to allow for distributed search. QuickWit search might be a nice example of how to approach this.
We could utilize the fact that Drives are stored on different subdomains. We could have one node that selects where to redirect the traffic to, dependingon the subdomain.

Interesting tools

SeaWeedFS, virtual filesystem / distributed storage system
Firecracker for running multiple VMS on one device
tikv distributed KV store (used as an alternative storage solutions to sled in the monolith project, might be inspring)

The text was updated successfully, but these errors were encountered:

jonassmedegaard · 2021-11-19T14:31:58Z

Perhaps there is some wisdom to borrow from 4store - or maybe ask its author, Steve Harris, to provide input here?

joepio · 2021-11-19T14:59:16Z

Thanks!

joepio · 2021-12-22T14:14:41Z

A different approach to multi-node, is FAAS. Convert all logic to stateless functions. Has some interesting benefits to scalability - no longer needed to copy instances and all their logic.

K/V store for persisting data (e.g. DynamoDB)
Convert all server logic and actor logic to Lambda functions

joepio · 2022-08-02T14:22:25Z

@AlexMikhalev has made some interesting progress on having a multi-node Atomic-Server setup. See the aws-ops repo.

He suggests:

One docker instance per user

use Github Username to create subdomains
Spin up atomic-server instance per user
Manage with pulumi (i.e. terraform that you can use with any language, like JS)
Let instances sleep (write memory to disk) when not used using fargate to minimize costs. It currently uses 1 VCPU with 1GB RAM

Some thoughts about this approach:

Might be costly to host if you do this for a lot of users
Start-up time is pretty slow, which leads to suboptimal onboarding. But this can be pre-cached probably. Creating the FargateService is currently done only to pass new env vars and start an instance, but we can probably take this out of the requested URL. We can improve on this if atomic-server doesn't depend on an ENV for BASE_URL.

joepio · 2022-09-09T07:16:37Z

Another approach:

Link Drives to Nodes

Every Drive (workspace) is linked to one specific machine.
Requires only minimal coordination between nodes. With a smart URL strategy, we could simply route to the right IP by using URL paths
Maximum size of one workspace is linked to one machine. But I don't think it's likely that many customers will user more than one node's full capacity.
As instances grow, we might need to move all data from one machine to another. Seems doable.

AlexMikhalev · 2022-09-09T07:57:51Z

Check out #463 - the sync between nodes was the whole point of using fluvio.
The bit which I didn't finish - is to remap atomic data URLs inside the smart module before the record reaches the streaming platform.

joepio · 2022-11-30T14:17:32Z

Another approach:

Use a multi-node KV store

Wherever we're currently using sled, we could use tikv (for example). This means instead of using an embedded key-value store, we'll use an external one that supports multi-node setups.

The big advantage is that Atomic-Server instances are no longer tied to specific data. In other words, tikv deals with the complexities of having a multi-node system.

However, there are still some stateful resources that are currently linked to machines. Plugins, specifically. Also, we'll still need to make sure that the watched_queries are scoped to specific workspaces, but that's an easily fixable problem.

For now, let's ignore plugins. Because even if we fix that problem, I'm not sure the solution will be very performant. There is quite a bit of overhead required when our storage system is on a different machine. Where we now get nanosecond responses, this will definitely become milliseconds. However, tikv still is very fast - on average one response takes less than 1 ms as well. This only becomes a bit worrisome if we do multiple consecutive gets that require multiple round trips. There are probably ways to work around this. I'm assuming there are ways we can do batch requests in their client.

2024-01 EDIT: This doesn't solve all issues:

What happens if the bottleneck isn't persistence, but compute? E.g. if a lot of Commits have to be processed at the same time. In that case, we can scale the KV all we want, but we're not fixing the core issue.
How do we distribute search? We'll need another tool for this, too.

I think this approach will not be sufficient. Back to the drawing board.

joepio · 2023-01-07T21:54:40Z

https://github.com/lnx-search/datacake/tree/main/examples/replicated-kv

Data cake also seems interesting, I'll take a closer look soon

joepio · 2024-01-05T12:57:11Z

Coordinator node approach

This is a load balancer / proxy server / rust server (actix / axum?) that forwards requests to AtomicServer instances, depending on the used subdomain (drive). Note that it needs to be aware of which node hosts which drive.

This means:

We don't need distributed KV stores / distributed search solutions!
Every drive is limited to the size of one node. Is this a problem? I don't expect it to, anytime soon. We can still use S3 storage for files, if persistence becomes the limiting factor per node.

Existing tools

AWS HAproxy (proprietary, deprecated)
Sozu (reverse proxy written in rust, configurable at runtime). Seems interesting.

Write our own

e.g. in Axum or Actix
Allows for a lot of customization
Requires moving / implementing HTTPS / TLS logic. Yuck.

joepio changed the title ~~Multi-node / distributed setup~~ Multi-node / distributed setup (scaling to huge datasets) Dec 22, 2021

joepio changed the title ~~Multi-node / distributed setup (scaling to huge datasets)~~ Multi-node / distributed setup (scaling to huge datasets / multi-tenant) Aug 2, 2022

joepio mentioned this issue Oct 4, 2022

consider alternative KV store #433

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multi-node / distributed setup (scaling to huge datasets / multi-tenant) #213

Multi-node / distributed setup (scaling to huge datasets / multi-tenant) #213

joepio commented Nov 19, 2021 •

edited

Loading

jonassmedegaard commented Nov 19, 2021

joepio commented Nov 19, 2021

joepio commented Dec 22, 2021 •

edited

Loading

joepio commented Aug 2, 2022 •

edited

Loading

joepio commented Sep 9, 2022

AlexMikhalev commented Sep 9, 2022

joepio commented Nov 30, 2022 •

edited

Loading

joepio commented Jan 7, 2023

joepio commented Jan 5, 2024 •

edited

Loading

Multi-node / distributed setup (scaling to huge datasets / multi-tenant) #213

Multi-node / distributed setup (scaling to huge datasets / multi-tenant) #213

Comments

joepio commented Nov 19, 2021 • edited Loading

Thoughts

Interesting tools

jonassmedegaard commented Nov 19, 2021

joepio commented Nov 19, 2021

joepio commented Dec 22, 2021 • edited Loading

joepio commented Aug 2, 2022 • edited Loading

One docker instance per user

joepio commented Sep 9, 2022

Link Drives to Nodes

AlexMikhalev commented Sep 9, 2022

joepio commented Nov 30, 2022 • edited Loading

Use a multi-node KV store

joepio commented Jan 7, 2023

joepio commented Jan 5, 2024 • edited Loading

Coordinator node approach

Existing tools

Write our own

joepio commented Nov 19, 2021 •

edited

Loading

joepio commented Dec 22, 2021 •

edited

Loading

joepio commented Aug 2, 2022 •

edited

Loading

joepio commented Nov 30, 2022 •

edited

Loading

joepio commented Jan 5, 2024 •

edited

Loading