Data structure in the context of a decent search engine

A simple look at structuring notes and notebooks in the context of a decent search engine and hash-tags

NoterBox has a straightforward notes-notebooks structure. There are notebooks, and each notebook can hold notes. Nothing simpler than that. And that is by design.

NoterBox Notebooks

Back when I started working on NoterBox, I was very much into having a nested notebooks structure. Later on, as I made progress with the development and had a basic (full-text database) search engine added, I quickly realized that I could find basically any stored info by simply searching (of course).

Digging into notebooks and sub-sub-sub-notebooks to find data was probably not the way to go. I know that this is highly debatable, and I guess that many would prefer the nested sort of data structure. My problem was that I also had to search attachments and attachment content (where possible). We’ll talk about that later, below.

So the nested notebooks structure…

  Personal Taxes notebook/
  ├── 2014 notebook/
  │   ├── turbotax note
  │   └── return note
  ├── 2015 notebook/
  │   ├── turbotax note
  │   └── return note
  └── 2016 notebook/
      ├── turbotax note
      └── return note

… becomes a simple “notebook with notes” data structure:

  Personal Taxes notebook/
  ├── 2014 note/
  │   ├── turbotax attachment/
  │   └── return attachment/
  ├── 2015 note/
  │   ├── turbotax attachment/
  │   └── return attachment/
  └── 2016 note/
      ├── turbotax attachment/
      └── return attachment/
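
As a rough illustration only (this is not NoterBox's actual code), the flattened structure could be modeled in Python as below. The class and field names are made up for this sketch; the point is simply that a notebook holds notes, a note holds attachments, and there is no nesting of notebooks.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attachment:
    name: str


@dataclass
class Note:
    title: str
    body: str = ""
    attachments: List[Attachment] = field(default_factory=list)


@dataclass
class Notebook:
    name: str
    # Notes only: a notebook never contains another notebook.
    notes: List[Note] = field(default_factory=list)


# The "Personal Taxes" example from the tree above.
taxes = Notebook(
    name="Personal Taxes",
    notes=[
        Note(title=year, attachments=[Attachment("turbotax"), Attachment("return")])
        for year in ("2014", "2015", "2016")
    ],
)

for note in taxes.notes:
    print(note.title, [a.name for a in note.attachments])
```

What used to be a sub-notebook per year is now a single note per year, and the old notes become that note's attachments.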

At this point, for me, the decision was made: switch to a “notebook with notes” data structure, and start focusing on integrating a better search engine, as well as making it easier for the user to search and display results. What does that mean?

  • Search everything & search by collections:

    • First, I wanted to be able to search everything, but also by collections. For example, only search within notes, notebooks or attachments.
  • Search operators

    • I needed to be able to find data using search operators. Like so: attachments: "foo" -bar. This and the above are both handled by Elasticsearch.
  • Suggestion engine

    • I also needed a suggestion engine, and the Typeahead.js / Bloodhound combination was perfect. It filters results as you type your search, so before hitting Enter you already have a narrowed set of results that you can highlight and pick from. Neato.
  • Displaying results

    • At first I thought that this would be an easy one. It turned out not to be. I needed a view for everything (all results, no matter the type) and then some additional separate views showing only individual collections.
    • While this worked, when put into practice and tested it seemed confusing and cluttered. After testing different layouts and putting myself in the user’s shoes, I’ve settled on displaying collections only, with the data most likely to be searched displayed first. And that data is Notes.
    • After Notes, Notebooks and then Attachments are displayed as separate collections. It’s hard to make decisions as a single user, but that’s where I am right now and I have to move on.
  • Search inside attachments

    • Right now I am working on implementing search inside attachments. The results will also go in the Attachments collection. I am aware that not all attachments are searchable, but a lot of them are, from text files, documents and PDFs to images. So yes, I want that.
  • Hash-tags

    • Ok, I stepped back and took a look. Something was still missing. I needed an additional way to interlink and group notes, besides Notebooks.
    • I have decided to go with hash-tags, which fit the bill. You can create hash-tags on the fly in any note, and they will interlink notes across different notebooks.
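
To make the interlinking idea concrete, here is a minimal Python sketch of how hash-tags written inline in note text could be collected into an index that spans notebooks. It is an illustration under assumptions, not how NoterBox does it: the regular expression, the function names and the `(notebook, title, body)` tuples are all hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a hash-tag is "#" followed by word characters, and is not
# glued to a preceding word character (so "abc#def" is not a tag).
HASHTAG = re.compile(r"(?<!\w)#(\w+)")


def extract_hashtags(text):
    """Return the set of lower-cased hash-tags found in a note body."""
    return {tag.lower() for tag in HASHTAG.findall(text)}


def build_tag_index(notes):
    """Map each hash-tag to the (notebook, note title) pairs that use it."""
    index = defaultdict(set)
    for notebook, title, body in notes:
        for tag in extract_hashtags(body):
            index[tag].add((notebook, title))
    return index


notes = [
    ("Personal Taxes", "2015", "Filed the return, see #receipts and #Accountant"),
    ("Home", "Renovation", "Keep all #receipts for the kitchen work"),
]
tag_index = build_tag_index(notes)

# One tag links notes that live in two different notebooks.
print(sorted(tag_index["receipts"]))
```

Because the index is keyed by tag rather than by notebook, the same tag groups notes regardless of where they are stored.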

NoterBox Hashtag

  • The hash-tags view will give you a global overview of your existing hash-tags. Hash-tags are also clickable and searchable.
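
Going back to the search operators item in the list above: in NoterBox the query handling is done by Elasticsearch, so the following is only a hand-rolled Python sketch of how a query such as attachments: "foo" -bar breaks down into a collection, a required phrase and an excluded term. The `parse_query` function, the `ParsedQuery` class and the set of collection names are assumptions made for illustration, and this is not how Elasticsearch itself parses queries.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed collection names, taken from the three collections in the post.
COLLECTIONS = {"notes", "notebooks", "attachments"}

# A token is an optional leading "-", then a quoted phrase or a bare word.
TOKEN = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')


@dataclass
class ParsedQuery:
    collection: Optional[str] = None  # None means "search everything"
    include: List[str] = field(default_factory=list)
    exclude: List[str] = field(default_factory=list)


def parse_query(query: str) -> ParsedQuery:
    parsed = ParsedQuery()
    for minus, phrase, word in TOKEN.findall(query):
        term = phrase or word
        if not term:
            continue  # e.g. an empty pair of quotes
        is_prefix = (
            not minus
            and not phrase
            and term.endswith(":")
            and term[:-1].lower() in COLLECTIONS
        )
        if is_prefix and parsed.collection is None:
            parsed.collection = term[:-1].lower()
        elif minus:
            parsed.exclude.append(term)
        else:
            parsed.include.append(term)
    return parsed


print(parse_query('attachments: "foo" -bar'))
```

A query without a collection prefix leaves `collection` as `None`, which corresponds to the "search everything" case.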

Need to find the needle in the haystack? Knock yourself out. 😎 By all means, this is not the end, but a starting point. As mentioned, from a user’s point of view this approach may not be for everyone, but I believe it may work for the majority. In terms of usage, the sort of functionality I am targeting here is:

If you’re an organized person, that’s good. If not, that’s fine too. Just throw information in NoterBox. You’ll still find your data easily.

I am taking a wild guess here, but you have to start somewhere. I am looking forward to receiving feedback from beta users.

NoterBox is not yet a finished product, and at this stage nothing is “set in stone”. I would love to hear your opinion on structuring Notes & Notebooks in the context of a search engine and hash-tags. You can give me a shout on Twitter at @NoterBox.

Come work with us. NoterBox is currently looking for a Technical Co-founder. Also, you can subscribe below to be notified when NoterBox becomes available.

Published in data-structures, general and noterbox, and tagged attachments, has-tags, notebooks and notes.