Prisma with PostgreSQL: the hard things about building a no code database
Introduction
Building Cycle has been a challenging journey. We initially built the product, then rebuilt it a year later, and finally built it again the year after that. It took us a total of two and a half years to establish a solid foundation that we could be proud of. However, why was it so complicated to create a seemingly simple product? What were the requirements that made developing a tool for product teams so complex? Luckily, we were naive enough to jump in and believe that everything would go smoothly!
The constraints
Every product team has its own unique typology, with its own process, concepts, and way of structuring product information. From the very beginning, it was a requirement to build a framework that could adapt to these different typologies. Some teams prefer to create bugs and features, while others prefer to create problems and solutions. Some teams even combine all of these elements and link them together with a well-thought-out hierarchical structure. This left us no choice but to build a no-code relational database. What could go wrong, right? 😅
Cycle's basic entities are simple. The "Doc
" is the atomic unit. The "Doctype
" simply refers to the type of Doc
you want to create (e.g., a bug, a feature, etc.). You can also add hierarchy constraints between your Doctypes
. "Properties
" are custom links with Doctypes
and allow you to decide which values can be assigned to a Doc
. Lastly, there is the "View
," which is the way to visualize your Docs
. You can actually make a comparison with a relational database, where Doctypes
are Tables, Docs
are Rows, Properties
are Fields, and Views
are, well... Views.
The hard thing about pagination
In Cycle, a Doc can be shared across any View without duplicating it. It means every View can potentially contain all the Docs of a workspace. If a workspace has +100K Docs, a view configuration that says « I want all the Docs from all Doctypes » (or select * from all_tables
where all_tables
would be a join on each table in SQL) will request +100K Docs.
We cannot afford to have the same approach as most companies who load all the ids then handle pagination from the front end, the load times would be terrible. Specifically in the case of grouped by views (group by
clause in an SQL query). This is what we initially did in the past and it forced us to rewrite the implementation. That’s why the pagination is made backend side: the first view query only requests 8 groups and 15 docs by group. The pagination is then handled group by group; we query 15 more docs for the group we scroll in but the frontend is agnostic of those docs.
This is only made possible via an existing entity in the database for each and every group. And every group by can be created out of any property in the workspace. This adds some complexity when building table stakes features like sorts or multi group by but this is the best way to scale with our constraints.
Hereunder, a comparison of the different fetching strategies from flexible management tools. As stated, this works well in their context where tasks, pages or issues live in the context of a single view/database/board but not for ours where docs are global.
- Notion
- Inital query:
queryCollection
query fetches all the ids of the pages contained in the Board and all the groups - Group query: Groups are built out of the already fetched data. The processing time being simply the client rendering the groups.
- Inital query:
- Clickup
- Initial query:
genericView
query fetches all the ids of the tasks contained in the Board and all the groups. - Group query: Every group knows which task is part of the group and fetches the task when scrolling in the group.
- Initial query:
- Trello
- Initial query:
cards=visible
query parameters fetches all the cards with all the informations. - Group query: No group query, the groups are built out of the first query and scrolling does not trigger any query or socket.
- Initial query:
- Asana
- Initial query: The initial data is fetched over a single
synced
socket - Group query: The group query is only triggered for builtin property (like status). For custom properties everything is fetched and groups are built out of the first query.
- Initial query: The initial data is fetched over a single
The hard thing about view configuration (filters, group by, …)
When first building Cycle, we weren’t quite sure of the data structure regarding Views which led us to store the information in a JSON format. This was a mistake because everything in Cycle is relational and deeply linked. Leveraging the relational features of a PostgreSQL database makes much more sense (as for most B2B productivity softwares). The migration from an unstructured json format was hard but necessary.
In Cycle, any property can be used in a filter which implies the value of a filter to be linked to the value assigned to a Doc, itself defined in the Property settings. This is made possible via a generic Value
entity in the database. Anything can be a Value
. From a Doc to a User, from a Customer to a Doctype or a simple Single select value. This Value
table being so centric, it is important to adopt a proper indexing strategy as explained in this previous article (you may suffer from latency leading to database timeouts otherwise).
The hard thing about separation of concerns
Reflecting on our journey, I have thought many times about the reasons we couldn’t build the proper foundation from the very beginning. There are many reasons, including the inherent complexity and the typical iterations when starting a business. However a less obvious one hit me when we rebuilt the product the last time.
At the very beginning of Cycle, everyone was full-stack. But what does full-stack actually mean ? Does it mean you’re as good in frontend as in backend ? Do you have the same expertise as a Senior frontend engineer and a Senior backend engineer ? If not why are you claiming yourself full-stack ? Trust me I fully understand the full-stack vibe and what it means in term of culture but when building a product where every interaction must be optimised there is no room for lack of expertise and you’ll iterate way faster if there is a proper separation of concerns (even in a small team).
Cycle is an API first product. It took us some time to embrace this philosophy but it was game changer. How did it impact the way we work ? Well, as stated by Conway's law, the structure of a product reflects the structure of its organization. Therefore, we structured the team in a way that aligns with the desired product outcome. The backend team aligns on the DB schema and the API schema serves as a way to align the frontends with the backends. The two schemas being clearly decoupled.
The hard things made easier
Cycle’s data model is the essence of the product. It was hard to build but if you want to adopt the same approach, here is a visual representation to start from :
And its prisma schema
Feel free to build whatever you want from this abstraction and to get in touch if you have any question (martin@cycle.app). To see it in action, you can access our API directly from our documentation.