SnoozyWalrus

Suggestions for tackling scaling issues

Hi all, I'm currently working at a startup and we have started to face scaling issues. For example, our database tables have started growing rapidly in row count. How do you handle this kind of data? How should I think about improving and writing queries so they perform optimally on billions of records? Can people with experience share any resources or personal experiences from going from 1 to 100? Even the smallest suggestions would help. Grateful and thanks to all in advance. 🙌

19mo ago
FuzzyPickle
Amazon · 19mo

Check all indices (indexes)

Find the queries in your monitoring tool that are consuming maximum CPU. Run EXPLAIN/ANALYZE on those (a quick sketch follows this list).

Move read queries to Read Replica and write queries to master.

Look at caching the results of popular queries.

Create appropriate shards if needed.

Run batch jobs to archive stale/unused data from the DB to S3.

Try to move some tables to NoSQL (start with tables that don't need any joins; skip tables where you need transactional properties).
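A minimal sketch of the index/EXPLAIN steps above, assuming MySQL 8.0.18+ and a hypothetical `orders` table (table and column names are illustrative only):

```sql
-- Hypothetical schema; adapt the table/column names to your own.
-- 1. See which indexes already exist on the table.
SHOW INDEX FROM orders;

-- 2. Profile a slow query. EXPLAIN ANALYZE (MySQL 8.0.18+) actually executes
--    the query and reports per-step row counts and timings, so run it on a
--    replica if the query is expensive.
EXPLAIN ANALYZE
SELECT id, status, total
FROM orders
WHERE customer_id = 42
  AND created_at >= '2024-01-01';

-- 3. If the plan shows a full table scan, add an index that matches the
--    WHERE clause, then re-run the EXPLAIN to confirm it is used.
CREATE INDEX idx_orders_customer_created
    ON orders (customer_id, created_at);
```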

SnoozyWalrus

Thanks for the detailed answer. We have applied indexes as well, but suppose we apply a WHERE query on an indexed attribute and that attribute is given an array of 50k IDs; then the indexes are not very useful either. Can you suggest a better way to handle this?

FuzzyPickle
Amazon · 19mo

An alternative approach is to reconsider your data model. Depending on your use case, you might create a separate document for each ID or group IDs in a way that aligns with your query patterns. This can lead to more efficient queries and better scalability.
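Beyond remodelling, one common workaround for very large ID lists (not from the thread; a sketch assuming MySQL and a hypothetical `items` table) is to load the IDs into an indexed temporary table and join against it instead of sending a 50k-element IN (...) clause:

```sql
-- Hypothetical names; illustrates the temp-table-join pattern only.
CREATE TEMPORARY TABLE tmp_ids (id BIGINT PRIMARY KEY);

-- Bulk-load the 50k IDs, e.g. via LOAD DATA or batched multi-row inserts.
INSERT INTO tmp_ids (id) VALUES (101), (102), (103);  -- ...and so on

-- Join instead of WHERE id IN (...), so the optimizer can drive point
-- lookups from the temp table through the index on items.id.
SELECT i.*
FROM items AS i
JOIN tmp_ids AS t ON t.id = i.id;
```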

ZestyBagel
Swiggy · 19mo

Move to DynamoDB from RDS.

FuzzyPickle
Amazon · 19mo

Why choose DynamoDB as the default NoSQL? It totally depends on the use case.

ZestyBagel
Swiggy · 19mo

Yeah, I meant NoSQL only; DynamoDB offers much better scalability imo.

SnoozyPotato

Which database? What's the current ingestion rate? What will it be in a year? What's the current resource allocation? You expect people to answer without any of these details. I can suggest a fancy system design that can support millions of reads and writes, but that doesn't mean it will solve your problem. Each use case has its own solutions.

People are jumping into the thread without asking about the constraints. If I were an interviewer, I would have rejected them solely for jumping to conclusions without asking for the finer details.

SnoozyWalrus

OK, so the database is MySQL. The ingestion rate is not known to me. It is a more read-heavy system, as we have a NoSQL DB where we store computed data. So if a specific table is updated, and suppose that table impacts 5 to 6 NoSQL tables, then a lot of reads happen to recompute the data. Also, the MySQL tables have started growing as more clients come on board, so on some tables a basic WHERE query is starting to take time. How can I grow this system?

SnoozyPotato

Get slow query logging enabled. Once you have that, use EXPLAIN ANALYZE to see if a query is doing a sequential/table scan. If that's the case, create indexes on the columns that are queried most and appear in the WHERE conditions, then re-run the EXPLAIN to validate that it's doing an index scan now. This is the starting point. You can further denormalise the tables if possible.
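For reference, a sketch of how that slow-query-log setup typically looks on MySQL (the threshold and file path are example values, and SET GLOBAL changes do not survive a restart unless also placed in my.cnf):

```sql
-- Log any statement slower than 1 second to the slow query log.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
SET GLOBAL slow_query_log_file = '/var/lib/mysql/slow.log';  -- example path

-- Optionally also log queries that run without using any index at all.
SET GLOBAL log_queries_not_using_indexes = 'ON';
```

Queries captured in the log can then be fed to EXPLAIN ANALYZE as in the earlier sketch.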

ZippyWalrus

Shard/read replica, index, partition. Note: indexing will slow down writes, and some types of indexes are faster than others.

- I worked on a very large-scale data warehouse at a financial institution.
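On the partitioning point, a minimal sketch (hypothetical `events` table, MySQL range partitioning by month) of how large append-mostly tables are often kept manageable, since old partitions can be dropped or archived cheaply:

```sql
-- Hypothetical table; the partition key must be part of every unique key.
CREATE TABLE events (
    id         BIGINT NOT NULL,
    created_at DATETIME NOT NULL,
    payload    JSON,
    PRIMARY KEY (id, created_at)
)
PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p2024_01 VALUES LESS THAN (TO_DAYS('2024-02-01')),
    PARTITION p2024_02 VALUES LESS THAN (TO_DAYS('2024-03-01')),
    PARTITION pmax     VALUES LESS THAN MAXVALUE
);

-- Dropping an old month is a metadata operation, far cheaper than DELETE:
-- ALTER TABLE events DROP PARTITION p2024_01;
```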

WigglyMuffin

I am sure you will be able to find it on Google. Just deep dive before asking here. It will help you in the long run.
