Fault Tolerant Staking
Unlike a single machine, a distributed system can fail partially. When a single machine fails, the whole system tends to go down. In a partial failure, by contrast, the system can keep operating while it recovers, without seriously affecting overall performance.

Fault tolerance is defined as the ability to keep providing service even when failures occur. A fault-tolerant system is sometimes called a highly dependable system, and the requirements related to dependability are classified into four categories.

Failure model

Typical failures for processes in a distributed system fall into four classes, and faults in a communication link are classified in the same way. Byzantine failures, in which false messages may be delivered, for example, are the worst and the most difficult to deal with.

Failure can be hidden by redundancy. There are three types of redundancy: information redundancy, time redundancy, and physical redundancy. Physical redundancy is the easiest to picture: mammals have two eyes, two ears, and two lungs, and even if one of these redundant organs fails, the body keeps working while hiding the breakdown.

At Obol, we are researching and building an infrastructure primitive called Distributed Validator Technology (DVT). DVT enables a new kind of validator, one that runs across multiple machines and clients simultaneously but behaves like a single validator to the network. Your validator can then stay online even if a subset of the machines fails; this is called Active/Active fault tolerance. Think of it like the engines on a plane: they all work together to fly the plane, but if one fails, the plane isn't doomed.

Obol's mission is to enable and empower people to share the responsibility of running the network. If you are part of a distributed validator cluster and your machine dies overnight, the other operators in your cluster will have your back.
You'll cover for them some other time, when they go on holiday for a week and their node falls out of sync. If we can share the responsibility of running nodes, we can open a new frontier of decentralisation. Solo validators can have backup. Staking firms can share risk and reward. DeFi protocols can diversify their staked ether exposure. Major institutions can hedge cloud provider risk. There's a benefit to everyone in building fault-tolerant, distributed validator tech.

The Staking Problem

So how do high-availability validators help with stake centralisation, Oisín? Here's my take: right now you take a massive bet on the person or team running your validator for you. If they do everything right, they make you a couple of percent in interest a year; if they do everything wrong, they lose it all. The decentralised staking industry is extremely nascent, and we haven't figured out how best to build trust-minimised staking for the community. Projects like Lido pool risk across everyone; projects like RocketPool isolate risk into individual pools. One gates entry with humans and votes, the other gates entry with tokens and bonding.
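
To put rough numbers on the earlier claim that a validator can stay online when a subset of its machines fails, here is a minimal sketch in Python. It assumes a hypothetical cluster that stays online as long as at least `threshold` of its `n` machines are up, with each machine online independently with the same probability `p`. The function name, the 3-of-4 setup, and the 99% uptime figure are illustrative assumptions rather than Obol parameters, and real-world failures are often correlated, so treat the result as an optimistic estimate.

```python
from math import comb


def cluster_availability(n: int, threshold: int, p: float) -> float:
    """Probability that at least `threshold` of `n` machines are online.

    Assumes every machine is online independently with probability `p`
    (an illustrative simplification, not a measured figure).
    """
    if not 0 <= threshold <= n:
        raise ValueError("threshold must be between 0 and n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    # Sum the binomial probabilities of exactly k machines being online,
    # for every k that still meets the threshold.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(threshold, n + 1)
    )


if __name__ == "__main__":
    # Hypothetical figures: each machine is up 99% of the time.
    single = cluster_availability(n=1, threshold=1, p=0.99)
    cluster = cluster_availability(n=4, threshold=3, p=0.99)
    print(f"single machine: {single:.6f}")
    print(f"3-of-4 cluster: {cluster:.6f}")
```

Under these assumptions, the hypothetical 3-of-4 cluster is available roughly 99.94% of the time, against 99% for a single machine, because the cluster only goes offline when two or more machines are down at once.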