If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > PostgreSQL > Can't find my performance bottleneck

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 06-16-11, 05:01
fdurden fdurden is offline
Registered User
 
Join Date: Jun 2011
Posts: 2
Can't find my performance bottleneck

Hi,

I'm using Postgres 8.4 and I have a lot of tables that have this basic schema:

Code:
CREATE TABLE Table1
(
  identifier character varying(32) NOT NULL,
  virtual_location numeric(4,0),
  counter1 numeric(3,0),
  counter2 numeric(3,0),
  counter3 numeric(3,0),
  ...
  ...
);
I'm querying these tables with something like this over JDBC:

Code:
SELECT T.* FROM Table1 T WHERE virtual_location >= ? and virtual_location < ?
These are read-only tables indexed on virtual_location. I'm doing a few of these reads in parallel, no two reads access the same table. I've tested with different levels of parallellism.

I have SSD drives in a SAN system with 10Gb connections between the client and server machine, and I've tested normal copy operations from the client machine over an NFS mount to the Postgres machine and I can get read speeds of >300MB/s. But when I run my queries dstat on the server machine tells me that net I/O is ~50MB/s and disc read I/O is ~50MB/s and the CPU's on both the client and server are mostly idle.

Does anybody have an idea of where the bottleneck in this system is? Is Postgres doing so many small reads that the disc latency is actually the bottleneck and can something be done about this?

Thanks!
Reply With Quote
  #2 (permalink)  
Old 06-16-11, 19:19
someidiot someidiot is offline
Registered User
 
Join Date: Mar 2011
Posts: 27
Quote:
Originally Posted by fdurden View Post
Hi,

I'm using Postgres 8.4 and I have a lot of tables that have this basic schema:

Code:
CREATE TABLE Table1
(
  identifier character varying(32) NOT NULL,
  virtual_location numeric(4,0),
  counter1 numeric(3,0),
  counter2 numeric(3,0),
  counter3 numeric(3,0),
  ...
  ...
);
I'm querying these tables with something like this over JDBC:

Code:
SELECT T.* FROM Table1 T WHERE virtual_location >= ? and virtual_location < ?
These are read-only tables indexed on virtual_location. I'm doing a few of these reads in parallel, no two reads access the same table. I've tested with different levels of parallellism.

I have SSD drives in a SAN system with 10Gb connections between the client and server machine, and I've tested normal copy operations from the client machine over an NFS mount to the Postgres machine and I can get read speeds of >300MB/s. But when I run my queries dstat on the server machine tells me that net I/O is ~50MB/s and disc read I/O is ~50MB/s and the CPU's on both the client and server are mostly idle.

Does anybody have an idea of where the bottleneck in this system is? Is Postgres doing so many small reads that the disc latency is actually the bottleneck and can something be done about this?

Thanks!
A normal file copy will perform better as that reads contiguous chunks of data from disk. For this query, PostgreSQL is probably scanning the index and then the table and will grab a single block at a time which is probably all over the place within the files. However, by running several of them in parallel against different tables it should scale up pretty well.
Reply With Quote
  #3 (permalink)  
Old 06-19-11, 05:13
JackVamvas JackVamvas is offline
Registered User
 
Join Date: Aug 2008
Posts: 48
Have you checked your Execution Plan ? Are there any clues ?
__________________
---------------------------------
www.sqlserver-dba.com
www.dba-db2.com
Reply With Quote
  #4 (permalink)  
Old 06-20-11, 04:31
fdurden fdurden is offline
Registered User
 
Join Date: Jun 2011
Posts: 2
Not much. In my test I actually set the values of virtual_location so that they matches every row, so the query returns the whole table. This is the query plan, and it naturally doesn't even use the index in this case:

Code:
"Seq Scan on Table1 t  (cost=0.00..39454.81 rows=986476 width=154) (actual time=0.017..412.861 rows=986654 loops=1)"
"  Filter: ((virtual_location >= 0::numeric) AND (virtual_location < 1501::numeric))"
"Total runtime: 460.608 ms"
I'm still betting on disk latency. I tested with standard rotating disks in a RAID 5 config and it was about 33% slower. Postgres is reading too little at a time. But maybe using a relational database for this isn't optimal to begin with, I'll have to look at other options...
Reply With Quote
  #5 (permalink)  
Old 06-20-11, 14:15
futurity futurity is offline
Registered User
 
Join Date: May 2008
Posts: 270
Quote:
Originally Posted by fdurden View Post
I've tested normal copy operations from the client machine over an NFS mount to the Postgres machine and I can get read speeds of >300MB/s. But when I run my queries dstat on the server machine tells me that net I/O is ~50MB/s and disc read I/O is ~50MB/s and the CPU's on both the client and server are mostly idle.
This doesn't really tell us much. Is this a raw file copy vs running your SQL queries from within your Java app? Hardly a meaningful comparison, if this is the case. Have you played with the server configuration settings? Determined if the driver needs configuration? The documentation, as well as the PostgreSQL mailing lists, will probably be your best resources.

Last edited by futurity; 06-20-11 at 14:19.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On