SnoozyWalrus

Suggestions for tackling scaling issues

Hi all, I'm currently working at a startup and we have started to face scaling issues. For example, our database tables have started growing rapidly in row count. How do you handle this kind of data? How should I think about improving and writing queries so they perform optimally on billions of records? Can people with experience share any resources or personal experiences from going from 1 to 100? Even the smallest suggestions would help. Grateful and thanks to all in advance. 🙌

19mo ago
FuzzyPickle
Amazon · 19mo

Check all indices (indexes)

Find the queries in your monitoring tool that are consuming maximum CPU. Run EXPLAIN/ANALYZE on those (a quick sketch follows this list).

Move read queries to Read Replica and write queries to master.

Look at caching the results of popular queries.

Create appropriate shards if needed.

Run batch jobs to archive stale/unused data from the DB to S3.

Try to move some tables to NoSQL (start with tables that don't need any joins; skip tables where you need transactional properties).
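A minimal sketch of the index/EXPLAIN steps above, assuming MySQL 8.0.18+ and a hypothetical `orders` table (table and column names are illustrative only):

```sql
-- Hypothetical schema; adapt the table/column names to your own.
-- 1. See which indexes already exist on the table.
SHOW INDEX FROM orders;

-- 2. Profile a slow query. EXPLAIN ANALYZE (MySQL 8.0.18+) actually executes
--    the query and reports per-step row counts and timings, so run it on a
--    replica if the query is expensive.
EXPLAIN ANALYZE
SELECT id, status, total
FROM orders
WHERE customer_id = 42
  AND created_at >= '2024-01-01';

-- 3. If the plan shows a full table scan, add an index that matches the
--    WHERE clause, then re-run the EXPLAIN to confirm it is used.
CREATE INDEX idx_orders_customer_created
    ON orders (customer_id, created_at);
```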

SnoozyWalrus

Thanks for the detailed answer. We have applied indexes as well, but suppose we apply a WHERE query on an indexed attribute and that attribute is given an array of 50k IDs; then the indexes are not very useful either. Can you suggest a better way to handle this?

FuzzyPickle
Amazon · 19mo

An alternative approach is to reconsider your data model. Depending on your use case, you might create a separate document for each ID or group IDs in a way that aligns with your query patterns. This can lead to more efficient queries and better scalability.
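Beyond remodelling, one common workaround for very large ID lists (not from the thread; a sketch assuming MySQL and a hypothetical `items` table) is to load the IDs into an indexed temporary table and join against it instead of sending a 50k-element IN (...) clause:

```sql
-- Hypothetical names; illustrates the temp-table-join pattern only.
CREATE TEMPORARY TABLE tmp_ids (id BIGINT PRIMARY KEY);

-- Bulk-load the 50k IDs, e.g. via LOAD DATA or batched multi-row inserts.
INSERT INTO tmp_ids (id) VALUES (101), (102), (103);  -- ...and so on

-- Join instead of WHERE id IN (...), so the optimizer can drive point
-- lookups from the temp table through the index on items.id.
SELECT i.*
FROM items AS i
JOIN tmp_ids AS t ON t.id = i.id;
```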

ZestyBagel
Swiggy · 19mo

Move to DynamoDB from RDS.

FuzzyPickle
Amazon · 19mo

Why choose DynamoDB as the default NoSQL? It totally depends on the use case.

ZestyBagel
Swiggy · 19mo

Yeah, I meant NoSQL only; DynamoDB offers much better scalability imo.

SnoozyPotato

Which database? What's the current ingestion rate? What will it be in a year? What's the current resource allocation? You expect people to answer without any of these details. I can suggest a fancy system design that can support millions of reads and writes, but that doesn't mean it will solve your problem. Each use case has its own solutions.

People are jumping into the thread without asking about the constraints. If I were an interviewer, I would have rejected them solely for jumping to conclusions without asking for the finer details.

SnoozyWalrus

OK, so the database is MySQL. The ingestion rate is not known to me. It is a more read-heavy system, as we have a NoSQL DB where we store computed data. So if a specific table is updated, and suppose that table impacts 5 to 6 NoSQL tables, then a lot of reads happen to recompute the data. Also, the MySQL tables have started growing as more clients come on board, so on some tables a basic WHERE query is starting to take time. How can I grow this system?

SnoozyPotato

Get slow query logging enabled. Once you have that, use EXPLAIN ANALYZE to see if a query is doing a sequential/table scan. If that's the case, create indexes on the columns that are queried most and appear in the WHERE conditions, then re-run the EXPLAIN to validate that it's doing an index scan now. This is the starting point. You can further denormalise the tables if possible.
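For reference, a sketch of how that slow-query-log setup typically looks on MySQL (the threshold and file path are example values, and SET GLOBAL changes do not survive a restart unless also placed in my.cnf):

```sql
-- Log any statement slower than 1 second to the slow query log.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
SET GLOBAL slow_query_log_file = '/var/lib/mysql/slow.log';  -- example path

-- Optionally also log queries that run without using any index at all.
SET GLOBAL log_queries_not_using_indexes = 'ON';
```

Queries captured in the log can then be fed to EXPLAIN ANALYZE as in the earlier sketch.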

ZippyWalrus

Shard/read replica, index, partition. Note: indexing will slow down writes, and some types of indexes are faster than others.

- I worked on a very large-scale data warehouse at a financial institution.
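On the partitioning point, a minimal sketch (hypothetical `events` table, MySQL range partitioning by month) of how large append-mostly tables are often kept manageable, since old partitions can be dropped or archived cheaply:

```sql
-- Hypothetical table; the partition key must be part of every unique key.
CREATE TABLE events (
    id         BIGINT NOT NULL,
    created_at DATETIME NOT NULL,
    payload    JSON,
    PRIMARY KEY (id, created_at)
)
PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p2024_01 VALUES LESS THAN (TO_DAYS('2024-02-01')),
    PARTITION p2024_02 VALUES LESS THAN (TO_DAYS('2024-03-01')),
    PARTITION pmax     VALUES LESS THAN MAXVALUE
);

-- Dropping an old month is a metadata operation, far cheaper than DELETE:
-- ALTER TABLE events DROP PARTITION p2024_01;
```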

WigglyMuffin

I am sure you will be able to find it on Google. Just deep dive before asking here. It will help you in the long run.
