Wednesday, May 31, 2017

Distributing AI Computation to the Clients

TriadCity relies on a lot of AI. AI-driven characters have employers to report to, homes to sweep up, meals to cook, hobbies to pursue, dates to go out on. Sports leagues play games which simulate real life; AI-driven commodity speculators calculate optimal bids. It's a lot of AI. The old-school way to scale would be to verticalize the game server: more CPUs, more RAM, bigger-faster-more. The game-y way would be to shard the game world across multiple servers horizontally, which for now we want to avoid. Either way, we're a tiny little struggling game company and we want to keep our costs as close to nil as possible. Besides, brute force seems really kinda conceptually offensive for tasks like these.

There's a better way. We can borrow cycles from players' computers by offloading portions of these computations to our TriadCity clients. It's a form of grid computing which distributes our scaling problem to many connected machines.

We do a lot of this. I'll note two contexts here which are interesting to me.

NLP is handled by the client. Players type natural English: "Show me what’s inside the red bag"; or, "What’s in the bag?" Client code parses those inputs before transmitting them to the server, translating them to canonical MUD-style commands based on old-timey DIKUmud. In this example, the client sends "l in red bag" or "l in bag". Offloading this computation helps a lot in keeping the server small. Individually these computations aren't taxing, but when you scale them to tens of thousands per second you're talking nontrivial horsepower. Pushing that CPU cost to the client is a simple form of grid computing which is easy to implement and makes sense in our context.
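To make the idea concrete, here's a toy sketch of that kind of client-side translation. It is not our actual parser: the rule table, the regular expression and the name `to_canonical` are all made up for illustration, and it only knows the one "look inside a container" pattern from the example above.

```python
import re

# Hypothetical rule table: (pattern, canonical command template).
# Illustrative only; a real grammar would cover far more phrasings.
_RULES = [
    (
        re.compile(
            r"^(?:show me )?what(?:'s| is) (?:inside|in) "
            r"(?:the |a |an )?(?P<obj>.+)$"
        ),
        "l in {obj}",
    ),
]


def to_canonical(text: str) -> str:
    """Translate natural English to a MUD-style command.

    Input that matches no rule is passed through unchanged, so players
    can still type canonical commands directly.
    """
    # Normalize case, curly apostrophes and trailing punctuation.
    cleaned = text.strip().lower().replace("\u2019", "'").rstrip("?.!").strip()
    for pattern, template in _RULES:
        match = pattern.match(cleaned)
        if match:
            return template.format(obj=match.group("obj"))
    return text.strip()


if __name__ == "__main__":
    print(to_canonical("Show me what\u2019s inside the red bag"))
    print(to_canonical("What\u2019s in the bag?"))
```

The server only ever sees the short canonical string, so the parsing cost stays on the player's machine.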

Somewhat more elaborately, we use the clients to pre-compute sporting events, for example chariot races modeled with accurate 2D physics. Think of SETI@Home: we push a data blat to the client with the stats of the chariot teams, their starting positions and race strategies; the client computes the race move-by-move and sends a computed blat back to the server. The server caches these as scripts for eventual playback. To players the races appear to be real-time, but they’re actually pre-determined by these client-side calculations. The arithmetic really isn’t that complex, but it’s easy to distribute, and we want to take advantage of whatever server-side processor savings we can.
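Here's a rough sketch of that round trip, under loud assumptions: the JSON field names (`race_id`, `seed`, `teams`, `track_length`), the functions `simulate_race` and `cache_script`, and the toy one-dimensional motion are all invented for illustration. The real races use 2D physics and race strategies, which this deliberately leaves out.

```python
import json
import random

# Server-side cache of finished race scripts, keyed by race id (hypothetical).
script_cache = {}


def simulate_race(payload: str, max_ticks: int = 10_000) -> str:
    """Client side: compute the race tick by tick, return a JSON script."""
    job = json.loads(payload)
    # Seeded RNG: the same payload always yields the same script, so the
    # server could re-run a job to spot-check a client's answer.
    rng = random.Random(job["seed"])
    positions = {t["name"]: float(t["start"]) for t in job["teams"]}
    frames = []
    winner = None
    for _ in range(max_ticks):
        for team in job["teams"]:
            # Toy 1D motion: base speed plus a little per-tick jitter.
            positions[team["name"]] += team["speed"] + rng.uniform(-0.5, 0.5)
        frames.append({name: round(pos, 2) for name, pos in positions.items()})
        leader = max(positions, key=positions.get)
        if positions[leader] >= job["track_length"]:
            winner = leader
            break
    return json.dumps(
        {"race_id": job["race_id"], "winner": winner, "frames": frames}
    )


def cache_script(result: str) -> dict:
    """Server side: store the returned script for later playback."""
    script = json.loads(result)
    script_cache[script["race_id"]] = script
    return script


if __name__ == "__main__":
    payload = json.dumps({
        "race_id": "race-1",
        "seed": 42,
        "track_length": 30.0,
        "teams": [
            {"name": "Reds", "start": 0.0, "speed": 3.0},
            {"name": "Blues", "start": 0.0, "speed": 1.0},
        ],
    })
    script = cache_script(simulate_race(payload))
    print(script["winner"], len(script["frames"]))
```

Playback is then just a matter of replaying the cached frames to spectators at whatever pace feels real-time.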

There's no inconvenience to players. Consumer computers spend bajillions of cycles idling between keystrokes. We borrow some. Nobody notices.

The major downside is that TriadCity players are locked to a specialized client: they can't use dumb telnet clients, for example. This is an obstacle for blind players. Telnet clients are simple for screen readers to manage; our GUI clients have been challenging to integrate with screen readers. As a workaround we've written a specialized talking client, which many sighted players turn out to prefer! I use it most of the time. But this isn't really where we want to live. We could enable telnet clients and forgo NLP for their users. We dislike that idea a lot, since it would violate our commitment to enabling blind players as first-class citizens in our universe, but we may go ahead and do it anyway based on player requests. Haven't decided yet.

Pushing AI to the client keeps the server small-ish. It's currently an 8-core, 16 GB one-unit pizza box in a co-lo, which we intend to migrate to AWS in a few months. Despite all the AI, it can handle tens of thousands of players no sweat. If we actually had that many players we'd be thrilled.

Please bring your friends.

