Our concurrency control schemes are designed for a partitioned main-memory database system similar to H-Store. This section gives an overview of the relevant aspects of the H-Store design.
Traditional concurrency control comprises nearly half the
CPU instructions executed by a database engine. This
suggests that avoiding concurrency control can improve
throughput substantially. H-Store was therefore designed
explicitly to avoid this overhead.
2.1 Transactions as Stored Procedures
H-Store only supports executing transactions that have
been pre-declared as stored procedures. Each stored procedure invocation is a single transaction that must either abort
or commit before returning results to the client. Eliminating
ad-hoc transactions removes client stalls, reducing the need
for concurrency.
2.2 No Disk
Today, relatively low-end 1U servers can have up to 128
GB of RAM, which gives a data center rack of 40 servers
an aggregate RAM capacity of over 5 TB. Thus, a mod-
est amount of hardware can hold all but the largest OLTP
databases in memory, eliminating the need for disk access
during normal operation. This eliminates disk stalls, which
further reduces the need for concurrent transactions.
Traditional databases rely on disk to provide durability.
However, mission critical OLTP applications need high availability which means they use replicated systems. H-Store
takes advantage of replication for durability, as well as high
availability. A transaction is committed when it has been
received by k > 1 replicas, where k is a tuning parameter.
2.3 Partitioning
Without client and disk stalls, H-Store simply executes
transactions from beginning to completion in a single thread.
To take advantage of multiple physical machines and multiple CPUs, the data must be divided into separate partitions. Each partition executes transactions independently.
The challenge becomes dividing the application’s data so
that each transaction only accesses one partition. For many
OLTP applications, partitioning the application manually
is straightforward. For example, the TPC-C OLTP benchmark can be partitioned by warehouse so an average of 89%
of the transactions access a single partition. There is
evidence that developers already do this to scale their applications, and academic research provides some
approaches for automatically selecting a good partitioning
key. However, unless the partitioning scheme is
100% effective in making all transactions only access a single partition, then coordination across multiple partitions
for multi-partition transactions cause network stalls and executing a transaction to completion without stalls is not possible. In this paper, we focus on what the system should do
in this case.
3. EXECUTING TRANSACTIONS
In this section, we describe how our prototype executes
transactions. We begin by describing the components of our
system and the execution model. We then discuss how single
partition and multi-partition transactions are executed.
Traditional concurrency control comprises nearly half the
CPU instructions executed by a database engine. This
suggests that avoiding concurrency control can improve
throughput substantially. H-Store was therefore designed
explicitly to avoid this overhead.
2.1 Transactions as Stored Procedures
H-Store only supports executing transactions that have
been pre-declared as stored procedures. Each stored procedure invocation is a single transaction that must either abort
or commit before returning results to the client. Eliminating
ad-hoc transactions removes client stalls, reducing the need
for concurrency.
2.2 No Disk
Today, relatively low-end 1U servers can have up to 128
GB of RAM, which gives a data center rack of 40 servers
an aggregate RAM capacity of over 5 TB. Thus, a mod-
est amount of hardware can hold all but the largest OLTP
databases in memory, eliminating the need for disk access
during normal operation. This eliminates disk stalls, which
further reduces the need for concurrent transactions.
Traditional databases rely on disk to provide durability.
However, mission critical OLTP applications need high availability which means they use replicated systems. H-Store
takes advantage of replication for durability, as well as high
availability. A transaction is committed when it has been
received by k > 1 replicas, where k is a tuning parameter.
2.3 Partitioning
Without client and disk stalls, H-Store simply executes
transactions from beginning to completion in a single thread.
To take advantage of multiple physical machines and multiple CPUs, the data must be divided into separate partitions. Each partition executes transactions independently.
The challenge becomes dividing the application’s data so
that each transaction only accesses one partition. For many
OLTP applications, partitioning the application manually
is straightforward. For example, the TPC-C OLTP benchmark can be partitioned by warehouse so an average of 89%
of the transactions access a single partition. There is
evidence that developers already do this to scale their applications, and academic research provides some
approaches for automatically selecting a good partitioning
key. However, unless the partitioning scheme is
100% effective in making all transactions only access a single partition, then coordination across multiple partitions
for multi-partition transactions cause network stalls and executing a transaction to completion without stalls is not possible. In this paper, we focus on what the system should do
in this case.
3. EXECUTING TRANSACTIONS
In this section, we describe how our prototype executes
transactions. We begin by describing the components of our
system and the execution model. We then discuss how single
partition and multi-partition transactions are executed.