There has been a strange disconnect in one sector of IT that's transitioning into the blockchain world, and I'm not sure anyone has noticed it much until now.

The problem is that an increasing number of DApps need to reference complex datasets to perform optimally, and it's the volume of data that has to be pushed through the chain that slows things down and prevents scaling. The DApps themselves are generally small in terms of data throughput, so no speed or scaling issues there, but the addition of complex datasets, and the compression/validation/access issues involved, makes things much slower.

This is a set of problems that seems to affect many projects in this space – examples being Factom, Tierion and quite a few others – namely that when they start moving the accompanying datasets through a blockchain environment, everything slows down dramatically.

Basically, the problems are threefold:

  1. How do you provide effective functionality with complex (larger) datasets while maintaining validation and security?
  2. How can you then scale to provide large transactional throughput to drive adoption and viability?
  3. How do you handle referencing (getting the RIGHT records/datasets – you don’t need the whole block) and compress effectively?

Well, someone seems to have come up with an effective approach to these issues and we are here to have a look at it – say hello to PepperDB:

[Image: PepperDB]

The issues – according to PepperDB

From the whitepaper:

[Image: whitepaper excerpt outlining the issues]

As you can see, these are serious issues that will prevent adoption and development in this space – bad news that urgently needs to be addressed.

The good news is that PepperDB claims to have a working, tested solution, and they have even published the results of their compression engine in the whitepaper; note that few projects are comfortable enough to put information like this into the public domain, especially at such an early stage. This tells me they know what they are doing and have built something worthy of serious consideration. Beyond the high-quality, tested and working compression engine, they also address the other points in the quote above effectively.

But why implement databases on the blockchain at all? This tells you why – if you can decentralize the benefits of today's centralized DB market, you can cut out the middleman and take back control of your data:

[Image: whitepaper excerpt on decentralizing the database market]

For example, theirs will be a ‘multi-database schema protocol’, which allows developers to build using their existing database stacks – a very big deal in terms of ease of use and speed of adoption. They will also provide SDKs for multiple programming languages, which reinforces this even further.

Their search and compression engine is called TerarkDB and can search for a specific record within a compressed dataset (compressed into a single file), rather than having to decompress the entire file to search and retrieve – an ingenious and powerful approach to this issue. What’s more, it is tested (results are published in the WP) and already in production use with Alibaba Cloud.
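To make the idea concrete, here is a toy Python sketch of the general principle – compress in independent blocks, keep a small index, and decompress only the block holding the record you want. (This is not Terark's actual algorithm, which searches compressed structures directly; all names here are invented for illustration.)

```python
import zlib

class BlockCompressedStore:
    """Toy block-compressed key-value store, illustrative only.

    Records are grouped into blocks and each block is compressed
    independently; a small in-memory index maps each key to its
    block, so a lookup decompresses one block, not the whole file.
    """

    def __init__(self, records, block_size=100):
        self.blocks = []    # compressed blobs, one per block
        self.index = {}     # key -> (block number, slot within block)
        items = list(records.items())
        for start in range(0, len(items), block_size):
            chunk = items[start:start + block_size]
            for slot, (key, _) in enumerate(chunk):
                self.index[key] = (len(self.blocks), slot)
            payload = "\n".join(value for _, value in chunk)
            self.blocks.append(zlib.compress(payload.encode()))

    def get(self, key):
        block_no, slot = self.index[key]                   # O(1) index hit
        payload = zlib.decompress(self.blocks[block_no])   # one block only
        return payload.decode().split("\n")[slot]

store = BlockCompressedStore({f"user:{i}": f"record {i}" for i in range(1000)})
print(store.get("user:42"))   # "record 42" – only one 100-record block opened
```

The win is the same in spirit: a point lookup touches a tiny fraction of the compressed data instead of forcing a full decompression.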

They claim that this can potentially improve blockchain storage capacity by a factor of 10-100x. Details can be found in their whitepaper, which is well written and clear – even for people who aren’t quite so technical.

[Image: project backers]

YC is a well-known VC company. But did you know about Daohe Capital?

To get around the problem of processing large amounts of data on-chain, which slows things considerably, they have separated out the storage element and process it off-chain. Only transactional and contract data is processed on-chain.

This provides an elegant workaround and enables significant scaling by greatly reducing the computing workload for these processes. File storage is low-level and decentralized across multiple nodes (like Storj/IPFS) – a proof-of-storage algorithm and token deposits guarantee security and availability, and nodes are rated on their cumulative time online.
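Here is a minimal sketch of that on-chain/off-chain split, with a dict standing in for a storage node and a list for the ledger (invented stand-ins – the real protocol layers proof-of-storage, deposits and node ratings on top):

```python
import hashlib

# Hypothetical stand-ins: a dict plays the off-chain storage node and
# a list plays the on-chain ledger; PepperDB's real protocol is richer.
off_chain_storage = {}   # content hash -> raw bytes (the bulky part)
on_chain_ledger = []     # only small records (hashes) go on-chain

def store_file(data: bytes) -> str:
    """Keep the bulky payload off-chain; anchor only its hash on-chain."""
    digest = hashlib.sha256(data).hexdigest()
    off_chain_storage[digest] = data                            # big, off-chain
    on_chain_ledger.append({"type": "store", "hash": digest})   # tiny, on-chain
    return digest

def retrieve_file(digest: str) -> bytes:
    """Fetch off-chain data and verify it against its on-chain anchor."""
    data = off_chain_storage[digest]
    assert hashlib.sha256(data).hexdigest() == digest, "data was tampered with"
    return data

h = store_file(b"a large dataset that would otherwise bloat the chain")
print(retrieve_file(h))
```

The chain never sees the dataset itself, only a fixed-size fingerprint, which is what keeps the transactional path fast.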

Details of how it all works are in section 4.0 on core technologies, and they are clear and fairly straightforward, unlike many WPs (IOTA – are you listening???). Feel free to take a look, since it would be hard to paraphrase it any more clearly – that’s how well the WP is written!

However, there is one interesting thing to note regarding their proprietary compression algorithm: they claim speeds of up to 200x faster than the algorithms developed by Facebook and Google.

[Image: compression speed benchmark comparison]

Additionally, their index search method, CO-Index, is very well designed, using a succinct data structure and a nested Patricia trie (they also nest the leaves), which results in high compression and several times less memory usage (32 times?) – check out the links below, and the trie sketch after them, if you are so inclined.

https://en.wikipedia.org/wiki/Succinct_data_structure

https://github.com/ethereum/wiki/wiki/Patricia-Tree
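For the curious, here is a minimal, plain radix/Patricia trie in Python – just the base structure (no succinct encoding, no nesting), not PepperDB's implementation:

```python
class RadixNode:
    """Node of a Patricia (radix) trie: edges carry whole substrings,
    so runs of single-child nodes are collapsed into one edge."""
    def __init__(self):
        self.children = {}   # edge label (str) -> RadixNode
        self.value = None

def _common(a, b):
    """Length of the common prefix of two strings."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n

def insert(node, key, value):
    if key == "":
        node.value = value
        return
    for label, child in list(node.children.items()):
        n = _common(label, key)
        if n == 0:
            continue
        if n < len(label):
            # Split the edge: "roman|e" and "roman|us" share "roman".
            mid = RadixNode()
            mid.children[label[n:]] = child
            del node.children[label]
            node.children[label[:n]] = mid
            child = mid
        insert(child, key[n:], value)
        return
    leaf = RadixNode()
    leaf.value = value
    node.children[key] = leaf

def lookup(node, key):
    if key == "":
        return node.value
    for label, child in node.children.items():
        if key.startswith(label):
            return lookup(child, key[len(label):])
    return None

trie = RadixNode()
for k, v in {"romane": 1, "romanus": 2, "rubens": 3}.items():
    insert(trie, k, v)
print(lookup(trie, "romanus"))   # 2 – reached via shared "r"/"oman" edges
print(lookup(trie, "roman"))     # None – a prefix alone is not a stored key
```

The memory win comes from edges storing whole substrings, so long shared prefixes are stored once; CO-Index then layers succinct encodings and nesting on top of this idea.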

[Images: CO-Index data structure diagrams]

As I stated earlier, PepperDB provides extensive, tested and validated stats regarding the capabilities of their Terark engine; these can be found at the end of section 4 of the WP (and they are impressive!).

Marketing/Positioning

The marketing and target-market focus, and the strategies to address them, are clearly and concisely expressed, so I will just reproduce them here as written (if it isn’t broken, no need to rewrite it!)

[Image: marketing and positioning excerpt from the whitepaper]

These are all bold claims, but there is no ambiguity and the focus is clear. Targeting development teams is a smart move – functionality and engagement will drive adoption, and, combined with the zero-fee transaction model and the excellent compression ratios, this will push down storage and service costs significantly.

In time this could position PepperDB very competitively as a storage platform versus AWS and others, especially if they can sign a ‘name’ like Twitter or Reddit, as they indicate they intend to.

Possible sectors where they will have an impact include:

  1. BBS/Online communities
  2. News/Information portals
  3. Publication resources/Literature etc.
  4. Enterprise Net Disk/backup
  5. Personal Net Disk/backup
  6. Digital contract storage/services etc.
  7. Decentralized Advertising platforms
  8. C2C Trading platforms

 

Token Economics 

Circulating supply = 1,000,000,000

[Image: token allocation breakdown]

In the foundation portion, most of the tokens will be used to incentivize developers and the community – for example, PepperDB will pay for DApp storage in the store – whatever it takes to drive the platform forward.

To prevent a quick cash grab, the 30% ‘Team motivation’ token allocation has a 3-year lockup period.
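As a sketch of what that cliff amounts to (the token-generation date below is an assumption for illustration – the WP only states the 3-year lockup on the 30% allocation):

```python
from datetime import datetime, timedelta

TOTAL_SUPPLY = 1_000_000_000
TEAM_ALLOCATION = TOTAL_SUPPLY * 30 // 100   # the 30% 'Team motivation' slice
LOCKUP = timedelta(days=3 * 365)             # 3-year lockup
TOKEN_GENERATION = datetime(2018, 12, 1)     # assumed TGE date, not from the WP

def team_tokens_unlocked(now: datetime) -> int:
    """All-or-nothing cliff: nothing is spendable until the lockup expires."""
    return TEAM_ALLOCATION if now >= TOKEN_GENERATION + LOCKUP else 0

print(team_tokens_unlocked(datetime(2020, 6, 1)))   # 0 – still locked
print(team_tokens_unlocked(datetime(2022, 6, 1)))   # 300000000 – cliff passed
```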

Token Ecosystem

[Image: token ecosystem diagram]

The approach seems very sensible – note especially the absence of mining, the decentralized aspects of the platform, and the randomized node selection for transaction validation and consensus.
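The randomized selection idea can be sketched simply: seed a PRNG with something every node already agrees on (say, a block hash), so all honest nodes independently derive the same ‘random’ committee. Purely illustrative – the node IDs and committee size below are invented, and PepperDB's actual rules are in the WP:

```python
import hashlib
import random

nodes = [f"node-{i}" for i in range(20)]   # hypothetical node IDs

def select_validators(block_hash: str, k: int = 5) -> list:
    """Pick k validators pseudo-randomly, seeded by the block hash,
    so every honest node computes the same committee."""
    seed = int(hashlib.sha256(block_hash.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return rng.sample(nodes, k)

print(select_validators("0xabc123"))   # same hash in -> same committee out
```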

Team 

The team is technically stacked for ability and experience, in terms of both advisors and developers, and verifiable via the usual LinkedIn profiles – take a look:

[Images: team and advisor profiles]

Also, their close connections with Y Combinator and Alibaba Cloud could help them greatly in terms of business management resources and expertise.

Roadmap

[Image: roadmap]

Private sale – June

Design finalization – August

Testnet launch/public sale/exchange listings – December

Mainnet and DApp developer community drive – Q1 2019

DApp launch (stage 1) and DApp store launch

ICO Cap – $30M

As you can see, these are still early days for this project, but all the essential elements are there: working, deployed code with the Terark engine; a strong team; good positioning and focus; clear communication; and strong backing for development and launch. The ICO cap is also in a good place at $30M.

This is one that I will be watching closely.