Newbie design questions about DB and searching

aws

#1

I am interested in serverless and would like to evaluate it a little (based on AWS services).

Maybe even as a replacement for an existing application that is in the works: a Node.js backend with a NoSQL DB (ArangoDB, a multi-model document/graph DB) and an Angular frontend.

My main concern is the database, and especially search. In my current infrastructure I can search the DB by full text, by geo data (items within a radius), and so on. These features are necessary for the frontend.

So the DB candidate would be DynamoDB, right?

Now my question is: how do you guys handle search?
Do you use only the DB, or something else (AWS Elasticsearch, etc.)? If I go in the serverless direction, I don’t want to manage the search solution myself…

Because it is a small startup, the solution should also be rather cheap…


#2

If you want to use AWS, then yes, it would be logical to go with DynamoDB for storing data, and ElasticSearch to search data. Just keep in mind you do need to stream the data from DynamoDB to ElasticSearch using a custom Lambda. There used to be a blueprint available to get the job done but it was removed.
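For what it's worth, here is a rough sketch of what such a custom Lambda could look like. It only shows the mapping step: turning DynamoDB stream records into the newline-delimited body that the Elasticsearch bulk API expects. The index name `items`, the key attribute `id`, and the function names are assumptions for illustration, and it assumes the stream is configured to include the new image. Actually sending the body to the cluster (and signing the request) is left out.

```python
import json

INDEX_NAME = "items"    # hypothetical index name
KEY_ATTRIBUTE = "id"    # hypothetical partition key of the table


def from_dynamodb(value):
    """Convert one DynamoDB-typed value ({"S": "x"}, {"N": "1"}, ...) to plain Python."""
    (tag, raw), = value.items()
    if tag in ("S", "BOOL"):
        return raw
    if tag == "N":
        try:
            return int(raw)
        except ValueError:
            return float(raw)
    if tag == "NULL":
        return None
    if tag == "L":
        return [from_dynamodb(item) for item in raw]
    if tag == "M":
        return {name: from_dynamodb(item) for name, item in raw.items()}
    raise ValueError(f"unsupported DynamoDB type: {tag}")


def to_bulk_body(event):
    """Build an NDJSON bulk body: index on INSERT/MODIFY, delete on REMOVE."""
    lines = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        doc_id = str(from_dynamodb(change["Keys"][KEY_ATTRIBUTE]))
        if record["eventName"] == "REMOVE":
            lines.append({"delete": {"_index": INDEX_NAME, "_id": doc_id}})
        else:
            # NewImage is only present when the stream view type includes new images.
            lines.append({"index": {"_index": INDEX_NAME, "_id": doc_id}})
            lines.append(from_dynamodb({"M": change["NewImage"]}))
    return "".join(json.dumps(line) + "\n" for line in lines)


def handler(event, context):
    """Lambda entry point; a real version would POST `body` to <endpoint>/_bulk."""
    body = to_bulk_body(event)
    return {"bulk_lines": body.count("\n")}
```

Keeping the mapping in a pure function like `to_bulk_body` makes it easy to test locally with a hand-written event before wiring up the stream.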


#3

I am curious whether Elasticsearch is the best option. As far as I understand, the AWS service is not really managed: you have to choose which EC2 instance type you use, and the service has to be always on… Is there anything comparable to Lambda/DynamoDB, where you only pay for actual consumption (and that is not too expensive)?


#4

I believe it is managed, as you don’t have to install or maintain anything yourself. But it is true that you are billed for the time it runs rather than for actual usage. If you don’t want this, you could consider Cloudant (an IBM service). The free plan could perhaps work for your startup. You can check the pricing here: https://www.ibm.com/analytics/us/en/technology/cloud-data-services/cloudant/pricing/

You could even consider dropping DynamoDB completely then, or you could stream the DynamoDB records to Cloudant like you would with ElasticSearch. Both Cloudant and ElasticSearch are built on top of Lucene, so query capabilities will be similar.

Btw: did you actually calculate how much the costs will be when using Elasticsearch? Could be that they are still very low.


#5

Another popular solution is Algolia. It’s basically a managed search index where you pay for a certain number of operations rather than choosing an instance size.
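To compare the two pricing models for your own numbers, a quick back-of-the-envelope calculation helps. The sketch below is only that: the function names are made up, and every price you pass in is a placeholder you would replace with figures from the providers' current pricing pages.

```python
def always_on_monthly(hourly_rate, hours_per_month=730):
    """Monthly cost of an instance billed per hour, regardless of traffic."""
    return hourly_rate * hours_per_month


def per_operation_monthly(operations, price_per_1000, free_operations=0):
    """Monthly cost when billed per operation, after an optional free allowance."""
    billable = max(0, operations - free_operations)
    return billable / 1000 * price_per_1000


def break_even_operations(hourly_rate, price_per_1000, free_operations=0,
                          hours_per_month=730):
    """Operations per month at which both pricing models cost the same."""
    fixed = always_on_monthly(hourly_rate, hours_per_month)
    return free_operations + fixed / price_per_1000 * 1000
```

Below the break-even volume the per-operation model is cheaper; above it, the always-on instance wins.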


#6

If you start exploring DynamoDB, you will quickly get stuck on the “how do I query my data now” question.

I’m working on a personal implementation of a true serverless backend for my web app.

My pick is based on one of the first websites to run on a serverless backend: dekki.com

They use Algolia for any kind of query.

For the rest, they rely on DynamoDB (basically whenever you just have to fetch an item by its key).

Algolia has a nice free tier that should be enough for a small personal project.

When you decide on your stack and start building it, let us know how it goes and what limitations you encounter.

Have a look at my personal project in case it gives you some ideas: https://github.com/Thommas/vgadb

My next step is to listen to DynamoDB events and send the data to an Algolia index. Then I will be able to query my data by any field.


#7

@Thommas, in the Lambda interface in AWS there is a blueprint available that can be used as a starting point to stream the data to Algolia (you would still have to map the DynamoDB stream events to the Algolia requests). I have used this to stream my data to ElasticSearch. It might be of help to you.


#8

Thanks, I will have a look.

“you would still have to map the DynamoDB stream events to the Algolia requests”

I was expecting to do this. I might want to not store every field of my objects in Algolia, only the fields I want to query.
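Roughly what I have in mind, as an untested sketch: map each stream record to a request in the shape I understand Algolia's batch endpoint to take (`updateObject` / `deleteObject` with an `objectID`), and copy only a whitelist of fields. The field names, the key attribute `id`, and the function names are placeholders, and the deserialization only covers scalar attributes.

```python
SEARCHABLE_FIELDS = ("title", "platform")  # hypothetical fields worth indexing
KEY_ATTRIBUTE = "id"                       # hypothetical partition key


def scalar(typed):
    """Unwrap a scalar DynamoDB-typed value such as {"S": "x"} or {"N": "3"}."""
    (tag, raw), = typed.items()
    if tag == "N":
        return float(raw)
    if tag in ("S", "BOOL"):
        return raw
    raise ValueError(f"unsupported DynamoDB type for indexing: {tag}")


def to_algolia_requests(event, fields=SEARCHABLE_FIELDS):
    """Turn DynamoDB stream records into a list of batch-style requests."""
    requests = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        # Key values are strings in the stream payload for both S and N keys.
        object_id = str(next(iter(change["Keys"][KEY_ATTRIBUTE].values())))
        if record["eventName"] == "REMOVE":
            requests.append({"action": "deleteObject",
                             "body": {"objectID": object_id}})
            continue
        image = change["NewImage"]
        # Copy only the whitelisted fields; everything else stays in DynamoDB.
        body = {name: scalar(image[name]) for name in fields if name in image}
        body["objectID"] = object_id
        requests.append({"action": "updateObject", "body": body})
    return requests
```

The full item stays in DynamoDB and can be fetched by the `objectID` that comes back from a search hit.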

“I have used this to stream my data to ElasticSearch”

Based on this article’s conclusion, why did you rely on ElasticSearch and not Algolia for queries?

Maybe you were using ElasticSearch for analytics or logs only?

Edit: nvm, this article is too old :)


#9

Not the best reason, but a requirement for the project was to only use AWS services. If I had had the choice to pick a different service, I think I would have used Cloudant, because I have good experience with it. But Algolia sounds like a good pick as well :) I guess all of them have at least some pros and cons.