Maintain internal and external-facing distributed machine learning and protocol infrastructure

This is a rare point-in-time opportunity to work on one of the world’s most important technology problems while upending the established political and corporate interests that control it and price-gouge its users. Gensyn will allow machine learning engineers and researchers to train models at a higher scale, and lower cost, than AWS. This is achieved via a highly specialised deep learning compute protocol with minimal verification overhead (read more in our Litepaper).

<aside> 🛠 Written in Rust and Python: a trustless protocol that rolls up work from off-chain ML runtimes into a Substrate blockchain for decentralised consensus

</aside>

<aside> 🧭 Autonomous environment: fully remote, flat hierarchy, low/no rules: just pure focus on delivering the compute protocol that will push the frontiers of artificial intelligence

</aside>

<aside> 💰 Backed by leading crypto infrastructure and deep learning investors, including: Eden Block, Galaxy Digital, Maven 11, CoinFund, Hypersphere, Zee Prime, PEER, Entrepreneur First, Counterview Capital, 7percent, and id4; as well as angels from DeepMind, Livepeer, Pocket, The University of Cambridge, Twitter, Google, Parity Technologies, and more

</aside>

Responsibilities

👉 Manage cloud and distributed on-premise resources - build integrated workflows over heterogeneous compute resources, both self-hosted and from multiple cloud providers

👉 Maintain development & production clusters - use up-to-date virtualisation and container orchestration tooling to maintain internal and external-facing auto-scaling infrastructure

👉 Provide CI infrastructure - help design hyper-efficient build processes and make the best use of available computational resources

👉 Develop decentralised monitoring systems - build logging and model training monitoring systems in a uniquely challenging decentralised environment

👉 Write - contribute to technical reports / papers describing the system and discuss them with the community

👉 Optimise cloud spend - monitor cloud resource usage and proactively automate cost savings

👉 Backend - occasionally write some simple backend systems

Minimum requirements

✅ Experience with container orchestration systems - setting up and maintaining Kubernetes, Nomad, or similar clusters (ideally over GPUs too)

✅ Experience with virtual networking - building and managing all network infrastructure (over cloud and on-premises compute) following best practices (VPNs, hardening, firewalls/netfilter, etc.)

✅ Experience with host deployment automation - previously set up and used automation tools like Ansible, Salt, or similar

✅ Experience with observability tools - covering log gathering, ingestion, and processing through to dashboards, using tools such as InfluxDB, Kapacitor, Grafana, Prometheus, Jaeger, and/or others

✅ Passion for decentralisation - a strong desire to see the internet’s infrastructure ripped from the hands of centralised cloud providers

✅ Appreciation for early stage start-ups - demonstrable experience working in environments that move with extreme speed and volatility