Results 1 to 5 of 5
  1. #1
    Join Date
    Jun 2011
    Posts
    2

    Unanswered: Can't find my performance bottleneck

    Hi,

    I'm using Postgres 8.4 and I have a lot of tables that have this basic schema:

    Code:
    CREATE TABLE Table1
    (
      identifier character varying(32) NOT NULL,
      virtual_location numeric(4,0),
      counter1 numeric(3,0),
      counter2 numeric(3,0),
      counter3 numeric(3,0),
      ...
      ...
    );
    I'm querying these tables with something like this over JDBC:

    Code:
    SELECT T.* FROM Table1 T WHERE virtual_location >= ? and virtual_location < ?
    These are read-only tables indexed on virtual_location. I'm doing a few of these reads in parallel, no two reads access the same table. I've tested with different levels of parallellism.

    I have SSD drives in a SAN system with 10Gb connections between the client and server machine, and I've tested normal copy operations from the client machine over an NFS mount to the Postgres machine and I can get read speeds of >300MB/s. But when I run my queries dstat on the server machine tells me that net I/O is ~50MB/s and disc read I/O is ~50MB/s and the CPU's on both the client and server are mostly idle.

    Does anybody have an idea of where the bottleneck in this system is? Is Postgres doing so many small reads that the disc latency is actually the bottleneck and can something be done about this?

    Thanks!

  2. #2
    Join Date
    Mar 2011
    Posts
    27
    Quote Originally Posted by fdurden View Post
    Hi,

    I'm using Postgres 8.4 and I have a lot of tables that have this basic schema:

    Code:
    CREATE TABLE Table1
    (
      identifier character varying(32) NOT NULL,
      virtual_location numeric(4,0),
      counter1 numeric(3,0),
      counter2 numeric(3,0),
      counter3 numeric(3,0),
      ...
      ...
    );
    I'm querying these tables with something like this over JDBC:

    Code:
    SELECT T.* FROM Table1 T WHERE virtual_location >= ? and virtual_location < ?
    These are read-only tables indexed on virtual_location. I'm doing a few of these reads in parallel, no two reads access the same table. I've tested with different levels of parallellism.

    I have SSD drives in a SAN system with 10Gb connections between the client and server machine, and I've tested normal copy operations from the client machine over an NFS mount to the Postgres machine and I can get read speeds of >300MB/s. But when I run my queries dstat on the server machine tells me that net I/O is ~50MB/s and disc read I/O is ~50MB/s and the CPU's on both the client and server are mostly idle.

    Does anybody have an idea of where the bottleneck in this system is? Is Postgres doing so many small reads that the disc latency is actually the bottleneck and can something be done about this?

    Thanks!
    A normal file copy will perform better as that reads contiguous chunks of data from disk. For this query, PostgreSQL is probably scanning the index and then the table and will grab a single block at a time which is probably all over the place within the files. However, by running several of them in parallel against different tables it should scale up pretty well.

  3. #3
    Join Date
    Aug 2008
    Posts
    147
    Have you checked your Execution Plan ? Are there any clues ?
    ---------------------------------
    www.sqlserver-dba.com
    www.dba-db2.com

  4. #4
    Join Date
    Jun 2011
    Posts
    2
    Not much. In my test I actually set the values of virtual_location so that they matches every row, so the query returns the whole table. This is the query plan, and it naturally doesn't even use the index in this case:

    Code:
    "Seq Scan on Table1 t  (cost=0.00..39454.81 rows=986476 width=154) (actual time=0.017..412.861 rows=986654 loops=1)"
    "  Filter: ((virtual_location >= 0::numeric) AND (virtual_location < 1501::numeric))"
    "Total runtime: 460.608 ms"
    I'm still betting on disk latency. I tested with standard rotating disks in a RAID 5 config and it was about 33% slower. Postgres is reading too little at a time. But maybe using a relational database for this isn't optimal to begin with, I'll have to look at other options...

  5. #5
    Join Date
    May 2008
    Posts
    277
    Quote Originally Posted by fdurden View Post
    I've tested normal copy operations from the client machine over an NFS mount to the Postgres machine and I can get read speeds of >300MB/s. But when I run my queries dstat on the server machine tells me that net I/O is ~50MB/s and disc read I/O is ~50MB/s and the CPU's on both the client and server are mostly idle.
    This doesn't really tell us much. Is this a raw file copy vs running your SQL queries from within your Java app? Hardly a meaningful comparison, if this is the case. Have you played with the server configuration settings? Determined if the driver needs configuration? The documentation, as well as the PostgreSQL mailing lists, will probably be your best resources.
    Last edited by futurity; 06-20-11 at 15:19.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •