  #1 (permalink)  
Old 03-14-11, 20:05
aaronenabs aaronenabs is offline
Registered User
 
Join Date: Mar 2011
Posts: 6
PostgreSQL sourcecode

Hi there,

My name is Aaron and I am new to this forum.
I am after a couple of things but would like to take it step by step.
First, I have never used PostgreSQL, which complicates matters. I have
been assigned a project that involves investigating the deletion process within PostgreSQL.

I would like to ask how to go about this. From my research I have found that PostgreSQL does not securely delete data from its tables, rows, columns or databases. Instead, the data is removed from the user interface and kept hidden from the user until it is overwritten by new data.

What I would like to do is delete some data and then locate it, using the source code or any other means, since deleted data still remains in parts of the DBMS.

If anyone understands what I am talking about, please advise on how I can achieve this. I have not installed any version of PostgreSQL yet, so it does not really matter which version I use, as long as I find a way to carry out this investigation.

Cheers
Aaron
  #2 (permalink)  
Old 03-15-11, 12:27
futurity futurity is offline
Registered User
 
Join Date: May 2008
Posts: 270
Quote:
Originally Posted by aaronenabs View Post
From my research I have found that PostgreSQL does not securely delete data from its tables, rows, columns or databases. Instead, the data is removed from the user interface and kept hidden from the user until it is overwritten by new data.
And where, exactly, did you discover this? If true, it would be sort of contrary to the whole point of having a database in the first place.

Quote:
What I would like to do is delete some data and then locate it, using the source code or any other means, since deleted data still remains in parts of the DBMS.
The source code is, thankfully, freely available from their website.
  #3 (permalink)  
Old 03-15-11, 16:21
rski rski is offline
Registered User
 
Join Date: Nov 2006
Posts: 82
Quote:
Originally Posted by aaronenabs View Post
From my research I have found that PostgreSQL does not securely delete data from its tables, rows, columns or databases. Instead, the data is removed from the user interface and kept hidden from the user until it is overwritten by new data.
Well, it is not like that. To satisfy the ACID rules, Postgres uses MVCC to keep different versions of the same row (uncommitted changes to a row are visible only in the transaction that made them).
When you delete a row (and commit that delete), it is not automatically removed from disk; it is only tagged as deleted. Postgres then uses the vacuum tool to remove such tagged rows.
In the Postgres sources, the file include/utils/tqual.h contains a macro, HeapTupleSatisfiesVisibility, which is responsible for hiding rows already tagged as deleted. If you modify that macro to always return true, you will see all data that has not been vacuumed, including rows already tagged as deleted. But I don't know why you would want to do that.
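
To make the tag-then-vacuum idea concrete, here is a minimal toy model in Python. It is only a sketch and not PostgreSQL's actual code: the names `ToyHeap`, `TupleVersion` and the `show_all` flag are invented for illustration, and real visibility rules also involve snapshots, in-progress transactions and hint bits, which are ignored here. The `show_all=True` path plays the role of a visibility check patched to always return true.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class TupleVersion:
    data: str
    xmin: int                   # id of the transaction that inserted this version
    xmax: Optional[int] = None  # id of the transaction that deleted it, if any


class ToyHeap:
    """Hypothetical, heavily simplified stand-in for a heap table."""

    def __init__(self) -> None:
        self.tuples: List[TupleVersion] = []
        self.committed: Set[int] = set()

    def insert(self, xid: int, data: str) -> None:
        self.tuples.append(TupleVersion(data, xmin=xid))

    def delete(self, xid: int, data: str) -> None:
        # A delete only tags the row version with xmax; nothing is removed.
        for t in self.tuples:
            if t.data == data and t.xmax is None:
                t.xmax = xid

    def commit(self, xid: int) -> None:
        self.committed.add(xid)

    def _is_dead(self, t: TupleVersion) -> bool:
        return t.xmax is not None and t.xmax in self.committed

    def is_visible(self, t: TupleVersion) -> bool:
        # Visible if the insert committed and no committed delete tagged it.
        return t.xmin in self.committed and not self._is_dead(t)

    def scan(self, show_all: bool = False) -> List[str]:
        # show_all=True mimics a visibility check that always returns true.
        return [t.data for t in self.tuples if show_all or self.is_visible(t)]

    def vacuum(self) -> None:
        # Only now are the tagged row versions dropped from the heap.
        self.tuples = [t for t in self.tuples if not self._is_dead(t)]


heap = ToyHeap()
heap.insert(1, "alice")
heap.insert(1, "bob")
heap.commit(1)
heap.delete(2, "bob")
heap.commit(2)

print(heap.scan())               # normal scan hides the deleted row
print(heap.scan(show_all=True))  # patched scan still sees it
heap.vacuum()
print(heap.scan(show_all=True))  # after vacuum the toy heap no longer holds it
```

In this model the deleted row stays reachable until `vacuum()` runs, which is the window the thread is discussing.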
  #4 (permalink)  
Old 03-15-11, 16:39
futurity futurity is offline
Registered User
 
Join Date: May 2008
Posts: 270
It occurs to me that the sarcasm intended in my original post may not have been obvious.

To expand on rski's point, this isn't just "hiding data from the user interface", but inherent to how PostgreSQL manages table data, maintains ACID compliance, and allows concurrent access to data. It's the very opposite of "not secure", whatever you may mean by that.

More information's available here.
  #5 (permalink)  
Old 03-15-11, 16:42
rski rski is offline
Registered User
 
Join Date: Nov 2006
Posts: 82
You are right, I did not notice your sarcasm.
  #6 (permalink)  
Old 03-15-11, 16:48
futurity futurity is offline
Registered User
 
Join Date: May 2008
Posts: 270
That's what I get for trying to be too clever on the interwebs.
  #7 (permalink)  
Old 03-15-11, 16:48
aaronenabs aaronenabs is offline
Registered User
 
Join Date: Mar 2011
Posts: 6
Rski,

Thanks for that. So, from my understanding of what you said,

the row is only tagged as deleted and is then actually removed by the vacuum command? And if the vacuum command has not been run, this tagged (deleted) data can still be accessed?

Like I said, this has to do with research carried out by Patrick Stahlberg on threats to privacy in database systems. He states that databases do not securely delete data.

thanks
  #8 (permalink)  
Old 03-15-11, 16:53
rski rski is offline
Registered User
 
Join Date: Nov 2006
Posts: 82
But I think some other databases work that way because of performance issues. If every delete statement had to trigger hard disk operations (to 'securely' delete data from a database you have to remove it from the files on disk), database performance would be very poor.
  #9 (permalink)  
Old 03-15-11, 16:59
aaronenabs aaronenabs is offline
Registered User
 
Join Date: Mar 2011
Posts: 6
Well, the research paper states
that PostgreSQL was checked on these points:

delete physically overwrites "No"
delete creates free space "No"

So I am trying to repeat this experiment, but I would need to get into the source code and understand how everything, or at least most things, works.

Thanks
  #10 (permalink)  
Old 03-15-11, 17:18
futurity futurity is offline
Registered User
 
Join Date: May 2008
Posts: 270
Quote:
Originally Posted by aaronenabs View Post
He states that databases do not securely delete data.
It's no less secure than deleting anything else off a hard disk: operating systems generally delete files by simply marking the space they occupy as being "free", and for all intents and purposes, they cease to exist. However, the data itself remains on the disk until physically overwritten. The same principle applies here. Sure, there are ways you can retrieve unvacuumed rows in PostgreSQL (just as you can recover "deleted" data off of a disk), but it's not a simple matter of issuing a SELECT command. And regardless, that data would continue to be present on the physical drive even after it's been "vacuumed" by PostgreSQL (unless it's been physically overwritten).
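
The "mark as free, overwrite later" principle can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not how any real file system or PostgreSQL lays out storage: `ToyDisk`, its 8-byte blocks and its lowest-free-block allocation policy are all invented for the example.

```python
from typing import Dict, List, Set


class ToyDisk:
    """Hypothetical block device: deleting a file only frees its blocks."""

    def __init__(self, n_blocks: int, block_size: int = 8) -> None:
        self.block_size = block_size
        self.blocks: List[bytes] = [bytes(block_size) for _ in range(n_blocks)]
        self.free: Set[int] = set(range(n_blocks))
        self.files: Dict[str, List[int]] = {}

    def write_file(self, name: str, payload: bytes) -> None:
        needed = -(-len(payload) // self.block_size)  # ceiling division
        if needed > len(self.free):
            raise OSError("toy disk is full")
        chosen = sorted(self.free)[:needed]           # lowest free blocks first
        for i, block_no in enumerate(chosen):
            chunk = payload[i * self.block_size:(i + 1) * self.block_size]
            self.blocks[block_no] = chunk.ljust(self.block_size, b"\0")
            self.free.discard(block_no)
        self.files[name] = chosen

    def delete_file(self, name: str) -> None:
        # Only the bookkeeping changes; the block contents are left untouched.
        self.free.update(self.files.pop(name))

    def raw_dump(self) -> bytes:
        # What someone reading the raw device would see.
        return b"".join(self.blocks)


disk = ToyDisk(n_blocks=4)
disk.write_file("a.txt", b"secret")
disk.delete_file("a.txt")
print("a.txt" in disk.files)          # the file is gone from the directory
print(b"secret" in disk.raw_dump())   # but its bytes are still on the device
disk.write_file("b.txt", b"XXXXXXXX")
print(b"secret" in disk.raw_dump())   # gone only once the block is reused
```

The old bytes disappear only when a later write happens to reuse the same block, which is why the thread distinguishes "deleted" from "physically overwritten".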

Quote:
delete physically overwrites "No"
delete creates free space "No"
If you read the link I posted, you'll see that these are conscious design decisions that are explained in the documentation.

If you're looking for disk security, then encrypt the drive and don't let malicious people have access to it.
  #11 (permalink)  
Old 03-15-11, 17:48
aaronenabs aaronenabs is offline
Registered User
 
Join Date: Mar 2011
Posts: 6
Thanks futurity,

Yes, it's kind of the same as a hard disk. The document I read was just trying to state that there might be privacy issues within databases, and pointed out how databases tag rows as deleted but do not actually delete the data.

Quote:
Sure, there are ways you can retrieve unvacuumed rows in PostgreSQL (just as you can recover "deleted" data off of a disk), but it's not a simple matter of issuing a SELECT command.
Is there any way you can advise me on how to achieve this, i.e. retrieve unvacuumed rows, or point me to documents that would help? You have really assisted me and I am grateful.
Thanks
  #12 (permalink)  
Old 03-15-11, 20:30
aaronenabs aaronenabs is offline
Registered User
 
Join Date: Mar 2011
Posts: 6
Hi

I have been looking through the experiment papers, and they say that PostgreSQL keeps 100% of its expired records in the DB slack (database slack), and that its trend line is superimposed on that of the expired records.

I guess what I am trying to do is insert a couple of records, delete them, and try to get into the DB slack before running a vacuum, to see if I can retrieve the deleted data.

Can anyone advise me on how to get into the DB slack, or point me to documents on how to locate it?

thanks
  #13 (permalink)  
Old 03-16-11, 10:27
futurity futurity is offline
Registered User
 
Join Date: May 2008
Posts: 270
Do a Google search for accessing/retrieving/restoring deleted rows. The PostgreSQL mailing lists -- where many of the developers hang out -- will probably be your most useful resource.
  #14 (permalink)  
Old 03-16-11, 11:21
artacus72 artacus72 is offline
Registered User
 
Join Date: Aug 2009
Location: Olympia, WA
Posts: 337
Some pg contrib modules may come in handy here.
Specifically pageinspect, which allows you to load a raw page of data, from which you can access the raw data for unvacuumed deleted rows.

And pgstattuple, which will give you information about how many pages are in the table and the percentage of dead rows.

But what's important here is that deleted data is certainly no more vulnerable than live data. While it may persist for a while, depending on your db settings, a dead row is every bit as secure as a live one... and much harder to get to.
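
As a rough idea of what inspecting a raw page involves, the Python sketch below decodes the line pointers of an 8 kB heap page and reads `t_xmin` and `t_xmax` from each tuple header, similar in spirit to what pageinspect's `heap_page_items(get_raw_page(...))` reports. It rests on assumptions that should be checked against the headers of the PostgreSQL version in use: a 24-byte page header with `pd_lower` at byte offset 12, 4-byte line pointers packed as `lp_off` (15 bits), `lp_flags` (2 bits), `lp_len` (15 bits), and a little-endian build. `decode_heap_page` and `HeapItem` are illustrative names, not part of any PostgreSQL API, and the page used here is synthetic.

```python
import struct
from typing import List, NamedTuple

PAGE_HEADER_SIZE = 24  # assumed size of the fixed page header
ITEM_ID_SIZE = 4       # assumed size of one line pointer
LP_FLAGS = {0: "unused", 1: "normal", 2: "redirect", 3: "dead"}


class HeapItem(NamedTuple):
    lp: int        # 1-based line pointer number
    lp_off: int    # byte offset of the tuple within the page
    lp_flags: str
    lp_len: int
    t_xmin: int    # inserting transaction id
    t_xmax: int    # deleting transaction id, 0 if never deleted


def decode_heap_page(page: bytes) -> List[HeapItem]:
    # pd_lower marks the end of the line pointer array.
    (pd_lower,) = struct.unpack_from("<H", page, 12)
    count = (pd_lower - PAGE_HEADER_SIZE) // ITEM_ID_SIZE
    items = []
    for i in range(count):
        (raw,) = struct.unpack_from("<I", page, PAGE_HEADER_SIZE + i * ITEM_ID_SIZE)
        lp_off = raw & 0x7FFF
        flags = (raw >> 15) & 0x3
        lp_len = raw >> 17
        t_xmin = t_xmax = 0
        if flags == 1 and lp_len >= 8:
            # The tuple header starts with t_xmin and t_xmax, 4 bytes each.
            t_xmin, t_xmax = struct.unpack_from("<II", page, lp_off)
        items.append(HeapItem(i + 1, lp_off, LP_FLAGS[flags], lp_len, t_xmin, t_xmax))
    return items


# Build a synthetic page holding one live row and one deleted-but-unvacuumed row.
page = bytearray(8192)
struct.pack_into("<H", page, 12, PAGE_HEADER_SIZE + 2 * ITEM_ID_SIZE)
struct.pack_into("<I", page, 24, 8000 | (1 << 15) | (32 << 17))
struct.pack_into("<I", page, 28, 8100 | (1 << 15) | (32 << 17))
struct.pack_into("<II", page, 8000, 100, 0)    # xmin=100, never deleted
struct.pack_into("<II", page, 8100, 100, 105)  # xmin=100, deleted by xid 105

for item in decode_heap_page(bytes(page)):
    print(item)
```

A non-zero `t_xmax` on a tuple whose line pointer is still "normal" is the kind of row the original question is about; against a real server the pageinspect functions do this decoding for you.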
  #15 (permalink)  
Old 03-16-11, 16:25
aaronenabs aaronenabs is offline
Registered User
 
Join Date: Mar 2011
Posts: 6
Hi guys,

I have done a little bit of research into the matter and found an "easy" way to do it, which I want to run by you.

"(1) shutdown your database and backup your data;
(2) change HeapTupleSatisfiesVisibility(),
just let it return "true", which means, it will treat everything as visible,
including deleted rows; compile the kernel;
(3) restart your database find out the data you want - you may select them into another table;
(4) revert the changes, and restart your database"

What do you think of this approach? My problem is that I am not familiar with PostgreSQL, so I won't even know where to look within the source code to try to retrieve these dead rows.