Results 1 to 13 of 13
  1. #1
    Join Date
    Feb 2003
    Location
    Netherlands
    Posts
    63

    Unanswered: Postgresql having an endless hunger for memory

    Hey Everyone,

    I'm new to Postgressql so I hope you guys can help me out with something. I've been searching for an answer for awhile and all though I have found many memory related topics, they all seem to be targetted at postgresql not using enough memory. I have the opposite problem.

    I'm migrating a database I have from another non Postgessql database to Postgressql by selecting records from the one database, and then inserting them into tables I've created on the postgressql. Easy does it.

    What I'm noticing is that with every insert the progressql server is taking on more memory. Now at first glance this might be fine, if it keeps the data that has been inserted in memory in case it is reused, thats just great, I'm all for caching. But it just keeps adding and adding and adding...
    At some point my memory is full and Windows decides to start swapping memory to disk, also great, thats what the page file is for. But it still is hungry and adds and adds and adds until the page file is full and windows decides to enlarge the page file... Eventually the page file reaches a size of 8 or 9 Gb and my server finally crashes because the disk is full.

    It seems to never figure out the data in memory is no longer needed and should be released. Now what exactly this data is, I am unsure, we're talking about taking on over 10Gb of memory, but my source data is less then 1Gb.

    At this point I can't rule out that this is caused by a bad implementation of the client all though I have seen the same behavior both access postgressql through ODBC and through a native data access layer that was added to the tool we're using. Also we are encountering this while running the client software on the server, as well as on a separate client machine. Its always the server that starts using odd amounts of memory.
    Finally we have postgressql running on a Windows server (8.2.6) and on a mac (8.2.5) and both have the same problems.

    Has anybody got any clue what I should change or what I am doing wrong that causes this? So far all the memory settings I've looked at seem to indicate what should be used minimally. I did look into the write ahead log thing that would suggest that my problem could be related to not actually writing the data to disk but keeping it in memory, still it should flush that eventually. I've played around with the settings but I'm not sure if I've tried the right things.
    Greetz,

    Bastiaan Olij

  2. #2
    Join Date
    Jun 2004
    Location
    Arizona, USA
    Posts
    1,848
    This sounds more like an application that you use is misbehaving.

    What tool are you using to Select from one database and Insert into the PG database? Is it running as an app on the pg server, or on a different machine?

    Normally, you would save the source database or table as a text file. (Script the entire database, or individual tables)

    Then, execute the script.

    You could also recreate the tables, one at a time. Then, export the data from the source database to a csv file, and import the csv file into the appropriate pg table.

    Ref post #8 here
    Lou
    使大吃一惊
    "Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
    "I have my standards. They may be low, but I have them!" - Bette Middler
    "It's a book about a Spanish guy named Manual. You should read it." - Dilbert


  3. #3
    Join Date
    Feb 2003
    Location
    Netherlands
    Posts
    63
    Loquin,

    Valid points for sure, I've done it with exports and imports various times aswell. In this case inserting it this way seemed to be the easiest.

    The tool is called Omnis Studio, the source database is a native solution to this tool.

    I've ran this both using the tool on the server, as having the tool on a separate machine connecting to the server on the database. I've also used both ODBC as a native PostgresSQL addition written for this tool with the same results. The ODBC version connecting to another database (MS SQL in this case) does not result in these memory problems.
    Greetz,

    Bastiaan Olij

  4. #4
    Join Date
    Jun 2004
    Location
    Arizona, USA
    Posts
    1,848
    Well, Omnis appears to be an RAD application, and not a database migration utility...

    If your application is opening a new connection for each record or for each table that it transfers, you would tend to see PostgreSQL's memory usage increase substantially with each new connection... Another database mightre-use an existing connection internally, rather than abandon the existing connection and open a new connection.

    If you open the task manager in windows, how many instances of the back end do you see as your app runs?
    Lou
    使大吃一惊
    "Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
    "I have my standards. They may be low, but I have them!" - Bette Middler
    "It's a book about a Spanish guy named Manual. You should read it." - Dilbert


  5. #5
    Join Date
    Feb 2003
    Location
    Netherlands
    Posts
    63
    Hi Loquin,

    Yes Omnis is a RAD tool, but as I mentioned it has a build in database system from which we are migrating. It also has many access modules to other database systems, it really is a great tool, you might wanne check it out.

    Purely from a migration point of view, if this where any other data source I would use exports/imports but in this particular case, this is the easiest solution. I can easily go the export/import route but as we'll still be using Omnis to build a client app that connects to PostgresSQL and does inserts/updates/delete based on user actions, I want to solve whatever memory issue we have.

    The tool does not keep making new database connections. Actually a work around we're using now is that after every 1000 or so inserts, we disconnect from the database and reconnect and that clears up the memory.

    Looking at the task manager I have 4 postgres.exe processes and one pg_ctl.exe process. Once I connect to postgress I get a 5th postgres.exe connection, this is the process that keeps eating memory. Once I disconnect that 5th process disappears and so does the memory allocated to it.
    Greetz,

    Bastiaan Olij

  6. #6
    Join Date
    Jun 2004
    Location
    Arizona, USA
    Posts
    1,848
    If you begin a transaction at the start of your processing, and don't commit the transaction, this could also cause what your're seeing.

    After connecting, try explicitly beginning a transaction, updating a thousand or so records, committing the transaction, and start another transaction block, rather than disconnecting & reconnecting.

    i.e. (pseudo code)
    Code:
    Connect to Source and target databases
    Open Source Table
    
    Create Target Table
    
    Begin Transaction
    
    Do While not Source.EOF
      Increment N
      If N mod 1000 = 0 then 
        Commit Transaction
        Begin Transaction
      End If
      Read Record from Source
      Write Record to Target
    Loop
    Commit Transaction
    Last edited by loquin; 02-05-08 at 17:22.
    Lou
    使大吃一惊
    "Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
    "I have my standards. They may be low, but I have them!" - Bette Middler
    "It's a book about a Spanish guy named Manual. You should read it." - Dilbert


  7. #7
    Join Date
    Feb 2003
    Location
    Netherlands
    Posts
    63
    Hi Loguin,

    That does seem to make a difference, when starting from scratch I can still see the memory go up as before, which I guess is logical because it still needs to allocate memory for what its doing. But once I commit suddenly the memory usage stops, its like its reusing the already allocated memory from the last time which is exactly what I want. Probably if it would need more memory for an action at a later time, it may use more memory again, but it doesn't seem to endlessly waiste it.

    Strange behaviour though because a single action without starting a transaction should behave as a single action between a begin transaction and commit. Atleast it does on Sybase which is what I'm used to
    Greetz,

    Bastiaan Olij

  8. #8
    Join Date
    Jun 2004
    Location
    Arizona, USA
    Posts
    1,848
    I don't understand this behavior from PostgreSQL either.

    pg has an auto-commit feature which is always enabled. There is no configuration setting (as there is with SQL Server (SET Implicit_Transactions,) and possibly Sybase) which will turn it off. The ONLY way to disable auto-commit is to issue a BEGIN statement. Then, you will have to explicitly COMMIT or ROLLBACK the transaction. However, once you've ended the transaction, you are forced to issue another BEGIN; else auto-commit is automatically in place again.

    Are you SURE that Omnis Studio isn't sending a SQL BEGIN statement at (or shortly after) the connection takes place when connecting to PostgreSQL? If it uses ODBC to connect to PostgreSQL, you might consider turning on the odbc logging, so that you can check the log file to see just what SQL commands it is actually shooting out to the server...
    Lou
    使大吃一惊
    "Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
    "I have my standards. They may be low, but I have them!" - Bette Middler
    "It's a book about a Spanish guy named Manual. You should read it." - Dilbert


  9. #9
    Join Date
    Feb 2003
    Location
    Netherlands
    Posts
    63
    Loquin,

    I've turned statement logging on and there doesn't seem to be a transaction started from the client side. Also nothing is locked, when not starting a transaction myself any change made to the database can directly be accessed from another client while if a transaction is used and not yet committed, the changes are not yet visible for other clients.

    So it does seem to behave just like it should with the exception of not reusing memory that is allocated by previous statements.

    Anyways, starting a transaction and committing it whenever changes are made to the database is not a problem and it seems to solve my problems.
    Greetz,

    Bastiaan Olij

  10. #10
    Join Date
    Jun 2004
    Location
    Arizona, USA
    Posts
    1,848
    Unlike SQL Server (Pre-2005,) SyBase, or MySQL (non-InnoDB/Falcon) , PostgreSQL doesn't use simple record locking per se to isolate transactions. Pg maintains separate versions of the records affected by your transactions. Ref MVCC (Multi-Version Concurrency Control) Depending upon the level of transaction isolation you specify, which version of the data that will be visible to a different user (and at what time) will vary.
    Last edited by loquin; 02-07-08 at 16:27.
    Lou
    使大吃一惊
    "Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
    "I have my standards. They may be low, but I have them!" - Bette Middler
    "It's a book about a Spanish guy named Manual. You should read it." - Dilbert


  11. #11
    Join Date
    Feb 2003
    Location
    Netherlands
    Posts
    63
    Loguin,

    Yes I noticed that, much like Oracle does it. Point is, there is a definite difference between records that are locked, and for which the changes can't be seen by another client, and records that are committed that can be queried by another client.

    That would suggest that any inserts done without explicitly starting a transaction, are committed, making it that more strange, there is a difference in memory usage behavior
    Greetz,

    Bastiaan Olij

  12. #12
    Join Date
    Jun 2004
    Location
    Arizona, USA
    Posts
    1,848
    Quote Originally Posted by BasOlij
    ...That would suggest that any inserts done without explicitly starting a transaction, are committed, making it that more strange, there is a difference in memory usage behavior
    Exactly. I've never seen pg continuing to inflate the memory requirements for inserts, unless a transaction is in progress. Which is why I suspect that something in the development environment is beginning a transaction...
    Lou
    使大吃一惊
    "Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
    "I have my standards. They may be low, but I have them!" - Bette Middler
    "It's a book about a Spanish guy named Manual. You should read it." - Dilbert


  13. #13
    Join Date
    Feb 2003
    Location
    Netherlands
    Posts
    63
    loquin,

    It probably is, but since I have no access to the source code that implements the connection to PG from Omnis, I can't determine the cause. It might be as simple as forgetting to call some sort of cleanup code. Omnis has a tendency to keep statements open until you try to fetch return data, even if there is none.

    It would have to be something it does consistently wrong as the same happens using the ODBC driver, but that doesn't effect other platforms.
    Greetz,

    Bastiaan Olij

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •