REQ: Thoughts on +1 million record db design.  
Author Message
Marquis





PostPosted: Thu Dec 04 00:23:35 CST 2003 Top

SQL Server Developer >> REQ: Thoughts on +1 million record db design.

Hello,

The largest database I've designed and built had couple tables of 100,000
records each, and it ran well. Now I have been presented with a challenge,
I've been asked to build a database that will start with several hundred
thousand records, grow to over a million in about a year and then keep
growing.

Your thoughts on what to consider, look out for and the like are greatly
appreciated.

The database will:
- have an asp.net interface
- be used by the general Internet public and will have a good deal of
traffic
- it's contents are text-only articles each about two pages long
- the articles will be organized using two models, meaning the presentation
layer needs to display the same article in one or both models at the
presentation layer
a: by category, much like http://www.hide-link.com/
b: by news topic (headlines, international, sports, etc.)
- when an article is posted to a category an email alert will be sent to
users who subscribed to the category.
- full-text search need for all the articles.
- about 1500 new articles a day will be added
- old articles will remain accessible

In addition to your strategic comments, here's a couple tactical questions
we can start the conversation with:
- should I put all the articles in one massive table and then use fk in
other tables to organize it?
- should thenew article inserts go into a table isn't being read by the
public and then do a mass import into the "live" article table every hour or
so?

TIA,

Rick

P.S. I really appreciate this group's timely and thoughtful responses.






---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system ( http://www.hide-link.com/ ).
Version: 6.0.547 / Virus Database: 340 - Release Date: 12/2/2003

SQL Server190  
 
 
David





PostPosted: Thu Dec 04 00:23:35 CST 2003 Top

SQL Server Developer >> REQ: Thoughts on +1 million record db design.


> Hello,
>
> The largest database I've designed and built had couple tables of 100,000
> records each, and it ran well. Now I have been presented with a challenge,
> I've been asked to build a database that will start with several hundred
> thousand records, grow to over a million in about a year and then keep
> growing.
>
> Your thoughts on what to consider, look out for and the like are greatly
> appreciated.
>
> The database will:
> - have an asp.net interface
> - be used by the general Internet public and will have a good deal of
> traffic
> - it's contents are text-only articles each about two pages long
> - the articles will be organized using two models, meaning the
presentation
> layer needs to display the same article in one or both models at the
> presentation layer
> a: by category, much like http://www.dmoz.org/
> b: by news topic (headlines, international, sports, etc.)
> - when an article is posted to a category an email alert will be sent to
> users who subscribed to the category.
> - full-text search need for all the articles.
> - about 1500 new articles a day will be added
> - old articles will remain accessible
>

First comment isn't about SQL at all. Consider usinng ASP.NET caching.
They really built a great caching scheme into the framework so you can have
a site full of slowly-changing, expensive-to-build pages and just "turn on"
caching. Your pages will be built once and beamed over and over to the
clients without hitting SQL or running a single line of VB code. Short of
caching the whole page, you can cache objects in the middle tier to reduce
the load on your web server and database, saving those resources for your
truly dynamic pages.

> In addition to your strategic comments, here's a couple tactical questions
> we can start the conversation with:
> - should I put all the articles in one massive table and then use fk in
> other tables to organize it?

One big table. A couple million records really isn't that big.
And remember if you store the text in a TEXT column, the text is not stored
in the table. The table just has a 16 byte id for it. So if you use a TEXT
column for the body, you can still have a pretty skinny table. Keeping it
skinny is the key to searching it fast.

Along these lines, if you want to extract the first 1000 characters or so
for a summary, that should go in a seperate table with a FK. That would
make your table fat.

> - should thenew article inserts go into a table isn't being read by the
> public and then do a mass import into the "live" article table every hour
or
> so?

1500 articles a day really isn't that much. Just stick 'em in.

David