Serengeti - Service Discovery and Configuration Management

Serengeti: The Autonomous Distributed Database

Press Release

Ataiva Releases Serengeti: Truly Autonomous Distributed Database for Modern Applications

For immediate release - June 2, 2019

Sub-headline

Serengeti revolutionizes distributed databases with a zero-configuration, self-organizing system that automatically discovers peers on the same subnet and replicates data without any human intervention.

The Problem

Organizations deploying distributed databases face significant operational challenges: complex configuration, manual cluster management, tedious replication setup, and fragile recovery processes. Traditional solutions require specialized expertise, extensive documentation, and constant maintenance. This complexity creates a high barrier to entry for teams wanting to leverage distributed databases, increases operational overhead, and introduces numerous opportunities for human error that can lead to data loss or system outages.

The Solution

Serengeti addresses these challenges with a revolutionary autonomous distributed database that requires zero configuration or management. Simply start Serengeti on any number of machines on the same subnet, and each instance automatically connects to the others to create a fully functional distributed database. Data is automatically replicated across the network, and when new nodes join, they immediately receive the existing database structure and data. If a node fails, the system automatically detects the failure and redistributes its data to other nodes after a brief recovery period. This autonomous approach eliminates configuration complexity and ongoing maintenance, allowing teams to focus on their applications rather than database administration.

Community Quote

“Serengeti has completely transformed how we approach distributed data storage,” says Michael Chen, Lead DevOps Engineer at CloudScale Solutions. “Before implementing Serengeti, we spent countless hours configuring and maintaining our distributed database cluster. Every time we scaled up or had a node failure, it required manual intervention and often led to cascading issues. With Serengeti, we simply deploy it on our machines, and it handles everything automatically. When we recently had an unexpected server outage, Serengeti seamlessly redistributed the data and maintained availability without any human intervention. It’s reduced our operational overhead by at least 70% while significantly improving our system reliability.”

How It Works

Serengeti employs a unique autonomous architecture that eliminates the need for manual configuration:

When a Serengeti node starts, it immediately begins scanning the subnet for other instances. Once peers are identified, nodes automatically establish connections and begin replicating data. The system continuously monitors the health of all nodes and automatically adjusts the replication strategy as nodes join or leave the network.

If an instance dies, the remaining nodes wait for a short recovery period in case it comes back online. If it does not, they reallocate the database pieces that were stored on that node to other nodes across the network. This self-healing approach keeps data available even during node failures.

Interacting with Serengeti is simple. Once a node is running, connect to the dashboard at http://<localhost_or_node_ip>:1985/dashboard to manage your distributed database. The interface lets you create and manage databases, tables, and data without needing to understand the complexities of the underlying distributed system.

Availability

Serengeti is available now as an open source project under the MIT license. Developers can access the source code, documentation, and pre-built JAR files in the GitHub repository or download the latest release from the releases page.

Get Started Today

Start building resilient, self-managing distributed databases with Serengeti. Visit the GitHub repository to download the database and contribute to its development.

Frequently Asked Questions

Project Questions

What is Serengeti?
Serengeti is an autonomous distributed database system that requires zero configuration or management to set up or maintain. It automatically discovers other instances on the same subnet, establishes connections, replicates data, and handles node failures without any human intervention. This makes it ideal for teams who want the benefits of a distributed database without the operational complexity typically associated with such systems.

Why was Serengeti created?
Serengeti was created to prove that distributed databases don’t need to be complex to set up and maintain. Traditional distributed databases require extensive configuration, ongoing management, and specialized expertise, creating significant operational overhead. Serengeti aims to eliminate this complexity with a truly autonomous approach that requires no human intervention, allowing teams to focus on their applications rather than database administration.

What makes Serengeti different from other distributed databases?
Serengeti differentiates itself through several key innovations:

  • Zero Configuration: Requires absolutely no manual setup or configuration
  • True Autonomy: Self-organizes and self-heals without human intervention
  • Automatic Discovery: Finds other instances on the same subnet without configuration
  • Seamless Data Replication: Automatically replicates data across the network
  • Intelligent Recovery: Handles node failures and redistributes data automatically
  • Simple Interface: Provides an intuitive dashboard for database management
  • JVM-Based: Runs on any machine with a Java Virtual Machine

What types of applications benefit most from Serengeti?
Serengeti is particularly valuable for:

  • Development and testing environments where quick setup is essential
  • Small to medium-sized applications requiring distributed data storage
  • Teams without dedicated database administrators
  • Educational environments for learning about distributed systems
  • Prototyping and proof-of-concept projects
  • Applications where operational simplicity is a priority
  • Environments where node failures are common and automatic recovery is essential

Technical Questions

How does Serengeti discover other nodes?
Serengeti automatically scans the local subnet to discover other instances. When a node starts, it broadcasts its presence and listens for responses from existing nodes. This subnet-based discovery means that all Serengeti instances must be on the same subnet to form a cluster, but requires no manual configuration of connection details.
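
As a rough illustration of what scanning a subnet involves, the sketch below lists the addresses a node might probe. It is not taken from Serengeti's source: the class and method names are hypothetical, and it assumes an IPv4 network with a 255.255.255.0 netmask, which this page does not specify.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Serengeti's code base.
final class SubnetCandidates {

    // Returns every other host address in the /24 subnet that contains localIp.
    // Assumption: IPv4 with a 255.255.255.0 netmask.
    static List<String> peersFor(String localIp) {
        String[] octets = localIp.split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Expected a dotted IPv4 address: " + localIp);
        }
        String prefix = octets[0] + "." + octets[1] + "." + octets[2] + ".";
        int self = Integer.parseInt(octets[3]);

        List<String> candidates = new ArrayList<>();
        // Skip .0 (network address) and .255 (broadcast address).
        for (int host = 1; host <= 254; host++) {
            if (host != self) {
                candidates.add(prefix + host);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        List<String> peers = peersFor("192.168.1.10");
        System.out.println(peers.size() + " candidate addresses, first: " + peers.get(0));
    }
}
```

A real node would then try to contact each candidate (or send a single broadcast) and keep only the addresses that answer as Serengeti instances.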

How does Serengeti handle node failures?
When a node fails, Serengeti implements a careful recovery process:

  1. Other nodes detect the failure through missed heartbeats
  2. The system waits for a short recovery period to see if the node comes back online
  3. If the node doesn’t recover, the system automatically reallocates the database pieces that were on that node to other nodes across the network
  4. Data consistency is maintained throughout this process to prevent data loss
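
The first three steps above can be sketched as a small heartbeat tracker. This is an illustrative sketch under stated assumptions, not Serengeti's implementation: the class name, the three-state model, and the timeout values are all hypothetical, since this page only says that the recovery period is "short".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical heartbeat-based failure detector; names and timings are assumptions.
final class FailureDetector {
    enum Status { ALIVE, SUSPECT, FAILED }

    private final long heartbeatTimeoutMs; // silence tolerated before a node is suspected
    private final long recoveryPeriodMs;   // extra grace time before data is reallocated
    private final Map<String, Long> lastSeen = new HashMap<>();

    FailureDetector(long heartbeatTimeoutMs, long recoveryPeriodMs) {
        this.heartbeatTimeoutMs = heartbeatTimeoutMs;
        this.recoveryPeriodMs = recoveryPeriodMs;
    }

    // Called whenever a heartbeat arrives from a peer.
    void recordHeartbeat(String nodeId, long nowMs) {
        lastSeen.put(nodeId, nowMs);
    }

    // Step 1: missed heartbeats; step 2: wait out the recovery period; step 3: declare failure.
    Status statusOf(String nodeId, long nowMs) {
        Long seen = lastSeen.get(nodeId);
        if (seen == null) {
            throw new IllegalArgumentException("Unknown node: " + nodeId);
        }
        long silence = nowMs - seen;
        if (silence <= heartbeatTimeoutMs) {
            return Status.ALIVE;
        }
        if (silence <= heartbeatTimeoutMs + recoveryPeriodMs) {
            return Status.SUSPECT;
        }
        return Status.FAILED;
    }

    // Nodes whose data should now be reallocated to the rest of the cluster.
    List<String> nodesToReallocate(long nowMs) {
        List<String> failed = new ArrayList<>();
        for (String nodeId : lastSeen.keySet()) {
            if (statusOf(nodeId, nowMs) == Status.FAILED) {
                failed.add(nodeId);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        FailureDetector detector = new FailureDetector(3_000, 10_000);
        detector.recordHeartbeat("node-a", 0);
        System.out.println(detector.statusOf("node-a", 1_000));  // ALIVE
        System.out.println(detector.statusOf("node-a", 5_000));  // SUSPECT
        System.out.println(detector.statusOf("node-a", 20_000)); // FAILED
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic. A running node would supply System.currentTimeMillis() instead.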

How is data replicated in Serengeti?
Serengeti automatically replicates data across multiple nodes to ensure availability and fault tolerance. When data is written to any node, the system:

  1. Determines the appropriate replication strategy based on the current cluster size
  2. Distributes copies of the data to selected nodes in the cluster
  3. Ensures consistency across all replicas
  4. Updates replication metadata so all nodes know where data is stored

When a new node joins the cluster, it automatically receives copies of existing databases and tables along with all replication information.
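
One way to picture a replication strategy that depends on cluster size is the sketch below. It is an assumption-laden illustration rather than Serengeti's actual algorithm: the hash-based placement, the class name, and the desiredCopies parameter are all hypothetical. The number of copies is capped at the number of nodes, and every node computes the same placement for the same key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical replica placement; the technique shown is simple hash-based
// placement over a sorted node list, not a documented Serengeti algorithm.
final class ReplicaPlacement {

    // Picks up to desiredCopies distinct nodes to hold the data identified by key.
    static List<String> replicasFor(String key, List<String> nodes, int desiredCopies) {
        // Sort and de-duplicate so every node derives the same ordering.
        List<String> sorted = new ArrayList<>(new TreeSet<>(nodes));
        List<String> chosen = new ArrayList<>();
        if (sorted.isEmpty() || desiredCopies < 1) {
            return chosen;
        }
        // A small cluster cannot hold more copies than it has nodes.
        int copies = Math.min(desiredCopies, sorted.size());
        // floorMod keeps the index non-negative even for negative hash codes.
        int start = Math.floorMod(key.hashCode(), sorted.size());
        for (int i = 0; i < copies; i++) {
            chosen.add(sorted.get((start + i) % sorted.size()));
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<String> cluster = List.of("10.0.0.3", "10.0.0.1", "10.0.0.2");
        System.out.println(replicasFor("users", cluster, 2));
    }
}
```

When a node joins or leaves, recomputing the placement with the new node list shows which pieces need to move.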

What are the system requirements for running Serengeti?
Serengeti has the following requirements:

  • JDK 11 or newer
  • Network connectivity on the same subnet as other nodes
  • Sufficient memory and disk space for your data needs
  • Port 1985 available for the dashboard interface
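
To confirm the port requirement before starting a node, a check along the following lines may help. It is a minimal sketch with a hypothetical class name: it briefly binds a server socket to the port and treats a bind failure as "in use".

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical pre-flight check; not part of Serengeti.
final class PortCheck {

    // Returns true if a server socket can currently be bound to the port.
    static boolean isFree(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true; // bind succeeded; the socket is closed again on exit
        } catch (IOException e) {
            return false; // typically a BindException: another process holds the port
        }
    }

    public static void main(String[] args) {
        System.out.println("Port 1985 free: " + isFree(1985));
    }
}
```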

Implementation Questions

How do I install and run Serengeti?
You have several options for installing and running Serengeti:

Option 1: Download the pre-built JAR

# Download from the releases page
# https://github.com/ao/serengeti/releases

# Run the JAR
java -jar serengeti.jar

Option 2: Build with Maven

# Clone the repository
git clone https://github.com/ao/serengeti.git
cd serengeti

# Build the project
mvn clean install

# Run the application
java -jar target/serengeti-1.0-SNAPSHOT.jar

Option 3: Run in IntelliJ IDEA

  1. Clone the repository
  2. Open in IntelliJ IDEA
  3. Edit configurations
  4. Add a new Application configuration
  5. Set the classpath to “Serengeti” and the Main class to “Serengeti”
  6. Run the application

How do I interact with Serengeti once it’s running?
Once Serengeti is running, simply connect to the dashboard interface at:

http://<localhost_or_node_ip>:1985/dashboard

This dashboard provides a user-friendly interface for managing your distributed database, including creating databases and tables, inserting and querying data, and monitoring the cluster status.
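
If you want to check from code whether a node's dashboard is answering, a probe like the one below is one option. It is a sketch only: the class and method names are hypothetical, the two-second timeouts are arbitrary, and it assumes nothing about the dashboard beyond the URL given above. It uses the java.net.http client that ships with JDK 11.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical dashboard probe; only the port and path come from the documentation.
final class DashboardProbe {
    static final int DASHBOARD_PORT = 1985;

    // Builds the dashboard URL for a host name or IP address.
    static URI dashboardUri(String host) {
        return URI.create("http://" + host + ":" + DASHBOARD_PORT + "/dashboard");
    }

    // Returns true if any HTTP response comes back within the timeout.
    static boolean isReachable(String host) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(dashboardUri(host))
                .timeout(Duration.ofSeconds(2))
                .GET()
                .build();
        try {
            client.send(request, HttpResponse.BodyHandlers.discarding());
            return true;
        } catch (IOException e) {
            return false; // connection refused, timed out, or otherwise unreachable
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(dashboardUri("localhost") + " reachable: " + isReachable("localhost"));
    }
}
```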

How can I test Serengeti?
Serengeti includes comprehensive test suites:

Running the full test suite:

mvn test

Running the fast test suite (completes in under 2 minutes):

# On Linux/Mac
./run_fast_tests.sh

# On Windows
run_fast_tests.bat

# Or directly with Maven
mvn test -Pfast-tests

What should I do if I encounter issues with Serengeti?
If you encounter any problems or need assistance:

  1. Check the GitHub issues to see if your problem has already been reported
  2. Create a new issue with details about your problem
  3. Include information about your environment, steps to reproduce the issue, and any error messages

Community & Support Questions

How can I contribute to Serengeti?
Contributions to Serengeti are welcome in many forms:

  • Code contributions via pull requests
  • Bug reports and feature requests via GitHub issues
  • Documentation improvements
  • Testing across different environments
  • Sharing use cases and examples
  • Answering questions in the community forums

Where can I get help with Serengeti?
Support resources include:

  • GitHub issues for bug reports and feature requests
  • Documentation in the GitHub repository
  • Community discussions in the repository’s Discussions section
  • Example configurations and tutorials

Is Serengeti production-ready?
Serengeti is currently in early development (version 0.0.1). While it demonstrates the potential for autonomous distributed databases, we recommend thorough testing in your specific environment before considering it for production use. The project is actively developed, and future versions will include additional features and stability improvements.

What’s on the Serengeti roadmap?
Upcoming features and improvements include:

  • Enhanced security features
  • Advanced query capabilities
  • Performance optimizations
  • Additional client libraries
  • Extended dashboard functionality
  • Improved monitoring and observability
  • Cross-subnet clustering options

How Nodes See One Another

Node Network Visualization

Serengeti nodes automatically discover and connect to each other on the same subnet, forming a resilient mesh network. Each node maintains awareness of other nodes in the system, enabling efficient data replication and request routing without central coordination.

How an Individual Node Operates

Node Architecture

Each Serengeti node operates autonomously with multiple components working together to provide distributed database functionality. The node continuously monitors the network, processes requests, and synchronizes data with peers to maintain system consistency.

Dashboard View

Dashboard Interface

The Serengeti dashboard provides a real-time view of the cluster status, including node health, database structure, and replication status. This intuitive interface makes it easy to manage your distributed database without complex commands or configuration files.

Key Features

Feature | Description
Zero Configuration | Simply run Serengeti on any machine with a JVM and it automatically discovers peers on the same subnet without any manual setup
Autonomous Operation | Self-organizing and self-healing system that requires no human intervention for normal operation
Automatic Data Replication | Data is automatically replicated across the network for fault tolerance and high availability
Seamless Node Integration | New nodes automatically receive existing database structure and data when they join the cluster
Intelligent Failure Recovery | System automatically detects node failures and redistributes data to maintain availability
Intuitive Dashboard | Web-based interface at port 1985 for easy database management and monitoring

Use Cases

Use Case | Description
Development Environments | Quickly set up a distributed database for development and testing without complex configuration
Educational Settings | Learn about distributed systems concepts with a practical, easy-to-use implementation
Small to Medium Applications | Provide distributed data storage for applications without dedicated database administration
Prototyping | Rapidly develop and test concepts that require distributed data storage
Edge Computing | Deploy autonomous database capabilities in edge environments with minimal management
Resilient Data Storage | Store data with automatic replication and failure recovery for improved availability