Samsung has announced a new prototype key-value SSD that is compatible with the first industry standard API for key-value storage devices. Earlier this year, the Object Drives working group of the Storage Networking Industry Association (SNIA) published version 1.0 of the Key Value Storage API Specification. Samsung has added support for this new API to their ongoing key-value SSD project.

Most hard drives and SSDs expose their storage capacity through a block storage interface, where the drive stores fixed-size blocks (typically 512 bytes or 4kB), each identified by a Logical Block Address that is usually 48 or 64 bits wide. Key-value drives extend that model so that a drive can support variable-sized keys instead of fixed-sized LBAs, and variable-sized values instead of fixed 512B or 4kB blocks. This allows a key-value drive to be used more or less as a drop-in replacement for software key-value databases like RocksDB, and as a backend for applications built atop key-value databases.
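To make the contrast between the two interfaces concrete, here is a minimal in-memory sketch in Python. It is only a toy model: the class names, the method names (`write_block`, `read_block`, `store`, `retrieve`, `delete`) and the 4096-byte block size are illustrative assumptions, not the SNIA API or Samsung's software.

```python
from typing import Dict, Optional

BLOCK_SIZE = 4096  # illustrative; the article cites 512 bytes or 4kB as typical


class ToyBlockDevice:
    """Toy block interface: fixed-size blocks addressed by an integer LBA."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self._blocks: Dict[int, bytes] = {}

    def write_block(self, lba: int, data: bytes) -> None:
        if not 0 <= lba < self.num_blocks:
            raise IndexError("LBA out of range")
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block write must be exactly BLOCK_SIZE bytes")
        self._blocks[lba] = data

    def read_block(self, lba: int) -> bytes:
        if not 0 <= lba < self.num_blocks:
            raise IndexError("LBA out of range")
        # In this toy model, blocks that were never written read back as zeros.
        return self._blocks.get(lba, bytes(BLOCK_SIZE))


class ToyKVDevice:
    """Toy key-value interface: variable-sized keys map to variable-sized values."""

    def __init__(self) -> None:
        self._store: Dict[bytes, bytes] = {}

    def store(self, key: bytes, value: bytes) -> None:
        self._store[key] = value

    def retrieve(self, key: bytes) -> Optional[bytes]:
        # Returns None when the key is absent.
        return self._store.get(key)

    def delete(self, key: bytes) -> bool:
        # Returns True only if the key existed and was removed.
        return self._store.pop(key, None) is not None
```

With the block model, an application storing a 5-byte record still has to pick an LBA and pad the record to a full block. With the key-value model it simply calls `store(b"user:42", b"hello")` and leaves placement to the device.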

Key-value SSDs have the potential to offload significant work from a server's CPUs when used to replace a software-based key-value database. More importantly, moving the key-value interface into the SSD itself means it can be tightly integrated with the SSD's flash translation layer, cutting out the overhead of emulating a block storage device and layering a variable-sized storage system on top of that. This means key-value SSDs can operate with much lower write amplification and higher performance than software key-value databases, with only one layer of garbage collection in the stack instead of one in the SSD and one in the database.

Samsung has been working on key-value SSDs for quite a while, and they have been publicly developing open-source software to support KV SSDs for over a year, including the basic libraries and drivers needed to access KV SSDs as well as a sample benchmarking tool and a Ceph backend. The prototype drives they have previously discussed have been based on their PM983 datacenter NVMe drives with TLC NAND, using custom firmware to enable the key-value interface. Those drives support key lengths from 4 to 255 bytes and value lengths up to 2MB, and it is likely that Samsung's new prototype is based on the same hardware platform and retains similar size limits.
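As a rough illustration of what those limits mean for software, an application could validate key and value sizes before handing them to the drive. The sketch below is hypothetical: `check_kv_sizes` is not part of Samsung's libraries, and it assumes "2MB" means 2 × 1024 × 1024 bytes, which the source does not confirm.

```python
MIN_KEY_BYTES = 4      # shortest key reported for the earlier prototypes
MAX_KEY_BYTES = 255    # longest key reported for the earlier prototypes
MAX_VALUE_BYTES = 2 * 1024 * 1024  # assumption: "2MB" read as 2 MiB


def check_kv_sizes(key: bytes, value: bytes) -> None:
    """Raise ValueError if the pair falls outside the reported prototype limits."""
    if not MIN_KEY_BYTES <= len(key) <= MAX_KEY_BYTES:
        raise ValueError(
            f"key is {len(key)} bytes; expected {MIN_KEY_BYTES} to {MAX_KEY_BYTES}"
        )
    if len(value) > MAX_VALUE_BYTES:
        raise ValueError(
            f"value is {len(value)} bytes; expected at most {MAX_VALUE_BYTES}"
        )
```

A caller would typically run this check first and then split oversized values across several keys, since the drive itself has no notion of a value larger than its limit.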

Samsung's Platform Development Kit software for key-value SSDs originally supported their own software API, but now additionally supports the vendor-neutral SNIA standard API. The prototype drives are currently available to companies that are interested in developing software to use KV SSDs. Samsung's KV SSDs probably will not move from prototype status to mass production until after the corresponding key-value command set extension to NVMe is finalized, so that KV SSDs can be supported without needing a custom NVMe driver. The SNIA standard API for key-value drives is a high-level transport-agnostic API that can support drives using NVMe, SAS or SATA interfaces, but each of those protocols needs to be extended with key-value support.


  • FunBunny2 - Thursday, September 5, 2019 - link

    RIP real, aka Relational, databases.
  • Billy Tallis - Thursday, September 5, 2019 - link

    They're still around, and always will be. But they're not the only game in town anymore, and most people understand that they're not always the best solution. Of course, there are also plenty of examples where people have gone overboard in rejecting relational databases.
  • shayne.oneill - Friday, September 6, 2019 - link

    "they're not always the best solution"

    As we've been (re)discovering, however, there aren't a lot of use cases where relational DBs are not the best solution. There's a good reason why non-relational databases were mostly abandoned in the 80s and 90s.
  • Urthor - Saturday, September 7, 2019 - link

    People will keep saying relational databases are dying as long as they keep mistakenly associating the term "relational" with tabular data.

    In reality, the principles behind relational databases are identical whether your data is stored in tables of objects or documents or parsable data, but people seem to think the term relational means SQL '92.
  • submux - Saturday, September 7, 2019 - link

    I think the most important factor is that there are far too many cases where relational DBs are the wrong "only answer" and where they are abused. There is a massive amount of unstructured data which has been stored in blobs for ages. When developing an application such as one for managing a doctor's office, you would want to store structured patient data in a relational database, but you'd want to store pictures, lab data (genetic data for example), X-rays, etc. in an object store.

    We don't need to do either/or and from an architectural perspective, running an SQL front-end on an object back end makes sense if only because most SQL servers have scalability issues. Legacy SQL ISAMs tend to shard very narrowly. By using Mongo, Couch, whatever, you can often scale out across many nodes. This is great for performance (maybe not for sync, but for query at least), and it makes it possible to simplify backup by keeping all data in a single data store.

    There is a holy grail to be discovered, and that's the means of making a great scalable database which handles structured and unstructured data equally well. This could be an SQL server with amazing blob handling or a NoSQL server with amazing query processing... or maybe a hybrid. It should be deployable using Docker or K8S without any complex replication junk, and it should scale almost linearly with each added node.

    I think this will happen soon. And these drives may make it more interesting if an object store ends up being how we can accelerate it.
  • ElishaBentzi - Thursday, September 5, 2019 - link

    Yes, this is another nail in the coffin for databases. Key-value stores and hash tables are almost all we need.
  • bcronce - Thursday, September 5, 2019 - link

    A key-value datastore is almost all you need [for non-relational data].

    Why do I say for non-relational data? Because humans, being humans, will make mistake after mistake if relational consistency is not enforced. Unless you have a simplistic use case or are dealing with the top 0.1% of programmers, you're going to have people corrupting your data.

    But but.. We don't have that issue! You say. ehhhh..... not in my experience. People who think they're not corrupting data seem to not know when they are. But we have all kinds of checks! you say. Yeah, but unless you're in the top 0.1%, you probably have the wrong checks or the checks are buggy and giving false positives.

    As someone who deals with data integration and loves RDBMSs, I generally load data into an RDBMS. And even from the biggest players in the industry, I will find invalid relations for data that spans years.

    Buggy software is much easier to fix than corrupt data.

    The real irony is many projects that don't want to use a RDBMS, run into many issues and start adding all kinds of business logic to their data layer, only to reinvent parts of a RDBMS, poorly.
  • bcronce - Thursday, September 5, 2019 - link

    Should be "false negative"
  • FunBunny2 - Thursday, September 5, 2019 - link

    "The real irony is many projects that don't want to use a RDBMS, run into many issues and start adding all kinds of business logic to their data layer, only to reinvent parts of a RDBMS, poorly."

    the crux of the issue: ever since IDMS & IMS reared their hierarchical heads, client-side coders have sought to sabotage transaction control in the datastore, which is what distinguishes a database from a datastore. client-side coders seek to extend their employment to infinity. current industrial strength RDBMS can handle petabyte storage, so it's not as if application only flat-files are necessary.
  • lkcl - Thursday, September 5, 2019 - link

    Lightning Memory-Mapped Database (LMDB) is a transactional key-value store based on copy-on-write shmem (shared memory) file capabilities and uses a B+ tree for data. the copy-on-write semantics mean that unlike a standard key-value store which must copy the data, LMDB may give you a *direct* pointer to the actual data. consequently, performance for 100 MEGABYTE values is as good as performance for a 100 BYTE value.

    its transactional capabilities work by allowing a (single) writer to write an alternative B+ tree whilst multiple readers use the current B+ tree, all of which is safe to do because the copy-on-write semantics isolate the writer entirely from the readers. when the transaction is complete, a new B+ tree "root" is created, which *only new readers* will be allowed to see (not the ones that currently have a read transaction in progress).

    it is extremely powerful and very obtuse code that took one (inexperienced) reviewer a YEAR to complete a full audit (he simply could not believe the listed capabilities of LMDB).

    interestingly - sadly - i doubt very much whether LMDB's capabilities could take advantage of Samsung's KV SSDs. also, do look up Howard Chu's comments on RocksDB and LevelDB. they are extremely informative (and not very complimentary, which, if you are going to have your business critically dependent on a key-value store, you need blunt, unfiltered, no-bullshit *technically correct* advice).
