If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > MySQL > Datatype for Status column of a large table

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 02-18-06, 22:55
screenmates screenmates is offline
Registered User
 
Join Date: Jul 2005
Posts: 63
Datatype for Status column of a large table

On a high-traffic website with millions of rows of data, which is more ideal for the status flag of rows?

tiny int: 0 - inactive, 1 - active, 2 - pending (NOTE: where status > 0)
char: I - inactive, A - active, P - pending (NOTE: where status != 'I')

There are a couple of more flags (overall not more than 5) but which datatype is more ideal in terms of performance, readability, etc?

If the table is large with over a million rows, do i have to create index on status column that has no more than 5 unique values? The status column is used in all queries, though.


Further, I used to work on Oracle and Sybase before where I used WHERE clauses as under:

where p.ProductId = invoice.ProductId


but MySQL seems to have a different way of joining tables:

FROM orders o INNER JOIN customer c
ON c.customer_id = o.customer_id

which is a bit hard for me to understand. However, I find that the Oracle-style where clause works fine in mysql. so, is it ok to use Oracle style syntax or should I learn the new mySql syntax?

Further, could someone give me some hints on the 'driving table' on how to order the tables in the FROM clause?

thanks,

Last edited by screenmates; 02-18-06 at 23:06.
Reply With Quote
  #2 (permalink)  
Old 02-19-06, 07:14
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
Quote:
Originally Posted by screenmates
On a high-traffic website with millions of rows of data, which is more ideal for the status flag of rows?

tiny int: 0 - inactive, 1 - active, 2 - pending (NOTE: where status > 0)
char: I - inactive, A - active, P - pending (NOTE: where status != 'I')
the tinyint

Quote:
Originally Posted by screenmates
If the table is large with over a million rows, do i have to create index on status column that has no more than 5 unique values?
not a separate index, no, but you might want to add the status column as a secondary column on other indexes

Quote:
Originally Posted by screenmates
but MySQL seems to have a different way of joining tables:

FROM orders o INNER JOIN customer c
ON c.customer_id = o.customer_id
that is actually easier to understand, not harder

plus, it is the standard sql way of joining tables

by the way, both oracle and sybase -- as well as all other databases -- also support the JOIN syntax

Quote:
Originally Posted by screenmates
Further, could someone give me some hints on the 'driving table' on how to order the tables in the FROM clause?
don't worry about this, as the optimizer does this job for you

however, if you mix INNER and OUTER joins, i have found that it helps comprehension to start the FROM clause with the table(s) that have filter conditions in the WHERE clause, and whose rows you want all of, and then attach first INNER and then LEFT OUTER JOIN tables one at a time
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book

Last edited by r937; 02-19-06 at 07:17.
Reply With Quote
  #3 (permalink)  
Old 02-19-06, 09:59
screenmates screenmates is offline
Registered User
 
Join Date: Jul 2005
Posts: 63
Quote:
Originally Posted by r937
the tinyint
not a separate index, no, but you might want to add the status column as a secondary column on other indexes
If my table has ID, Description and Status columns and the select statement is:

select ID, Description from... where Status > 0

will the index be on ID+Status ?

ID is the 'primary key', Status does not have to be in the result set and Description does not have to be indexed.
[/QUOTE]


Quote:
Originally Posted by r937
that is actually easier to understand, not harder

plus, it is the standard sql way of joining tables

by the way, both oracle and sybase -- as well as all other databases -- also support the JOIN syntax
A bit confused... Did you say this is the standard sql:

FROM orders o INNER JOIN customer c
ON c.customer_id = o.customer_id

Will the optimizer have a problem if I use:

where c.customer_id = o.customer_id

Quote:
Originally Posted by r937
don't worry about this, as the optimizer does this job for you

however, if you mix INNER and OUTER joins, i have found that it helps comprehension to start the FROM clause with the table(s) that have filter conditions in the WHERE clause, and whose rows you want all of, and then attach first INNER and then LEFT OUTER JOIN tables one at a time
Is there anything i should do to reduce burden on the optimizer since it is a high traffic site?

Thanks much...
Reply With Quote
  #4 (permalink)  
Old 02-19-06, 10:04
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
Quote:
Originally Posted by screenmates
If my table has ID, Description and Status columns and the select statement is:

select ID, Description from... where Status > 0

will the index be on ID+Status ?
no, because that index won't be used -- you're asking for all rows with status>0 and that will definitely use a table scan


Quote:
Originally Posted by screenmates
Did you say this is the standard sql:

FROM orders o INNER JOIN customer c
ON c.customer_id = o.customer_id

Will the optimizer have a problem if I use:

where c.customer_id = o.customer_id
yes it is, and no it won't, but you do realize that your WHERE condition will be completely redundant to the ON condition, right?

Quote:
Originally Posted by screenmates
Is there anything i should do to reduce burden on the optimizer since it is a high traffic site?
use stored procedures (which require the optimizer only at time of declaration) instead of queries (which require the optimizer on each call)
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #5 (permalink)  
Old 02-19-06, 10:48
screenmates screenmates is offline
Registered User
 
Join Date: Jul 2005
Posts: 63
Quote:
Originally Posted by r937
no, because that index won't be used -- you're asking for all rows with status>0 and that will definitely use a table scan
If i create an index on Status column and use:

where Status > 0

will it use the index? Is index required for 4-5 unique values of a column?

Quote:
Originally Posted by r937
but you do realize that your WHERE condition will be completely redundant to the ON condition, right?
Is the standard standard sql the recommended way to go?

Quote:
Originally Posted by r937
use stored procedures (which require the optimizer only at time of declaration) instead of queries (which require the optimizer on each call)
That's a great piece of advice!

Thanks...
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On