  1. #1
    Join Date
    Aug 2011

    Seeding data with security info


    Not sure where this question should go.

    I am creating a database that will hold consumer electrical goods information.

    Washing Machine

    I have built up most of the data but am worried about how to protect it once I make it available to customers, who will be paying for a license to use it. They will receive it in CSV format or via online access.

    Is there a way of creating hash information, or any alternative, that can be hidden in the data itself? I am wary of creating columns that hash things, as they could just be removed. Ideally I want something that affects the important data without making it unusable.

    The data will consist mainly of numbers and textual descriptions, so I could put specific misspellings etc. into the data, but that may affect its sellability.

    Has anyone got ideas about this?



  2. #2
    Join Date
    Sep 2009
    The mailing list companies add fictitious accounts that lead back to them, so that if you use the mailing list too many times, or pass it on to someone else, they know about it.
    Why not host it yourself, and sell userids and passwords?
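
    As a rough sketch of how the fictitious-record idea might carry over to product data: each customer's copy gets one invented row whose model code is derived from a private key and the customer's ID, so a leaked copy can be traced back. The key, the field names, the made-up brand and the function names are all assumptions for illustration, not anything from an existing system.

```python
import hashlib
import hmac

# Hypothetical private key; it must never be shipped with the data.
SECRET_KEY = b"replace-with-a-private-key"


def ghost_record(customer_id: str) -> dict:
    """Build a fictitious product row whose model code is tied to one customer."""
    digest = hmac.new(SECRET_KEY, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return {
        "brand": "Norvane",  # invented brand, so no real product is misdescribed
        "model": "ZX" + digest[:8].upper(),
        "description": "Freestanding freezer",
    }


def seed(rows: list, customer_id: str) -> list:
    """Return a copy of the rows with this customer's ghost record appended."""
    return rows + [ghost_record(customer_id)]


def trace_leak(rows: list, customer_ids: list) -> list:
    """List the customers whose ghost record appears in a suspect data set."""
    models = {row["model"] for row in rows}
    return [c for c in customer_ids if ghost_record(c)["model"] in models]
```

    In practice the ghost row would be placed somewhere less obvious than the end of the file. It is still "wrong" data in a sense, but it describes a product that does not exist rather than misdescribing a real one.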

  3. #3
    Join Date
    Aug 2011
    Thanks for the reply Kitaman,

    I am not sure how it would work, seeding "wrong" information, as we are planning to provide/sell the data based on its accuracy and up-to-date-ness.

    1st suggestion - I understand what you mean about the mailing companies, as they set up addresses or telephone numbers leading back to them. I'm not sure how that would work in this instance, as it's product information/data.

    2nd suggestion - That may work better in terms of having the data stored within our control. The next question would be how to make it available to automated systems and websites that would feed from it.

    e.g. on a shopping website someone does a search for all Bosch freezers; the website would then query ours, and we would provide that information back to the originating website. Is that how you would see it working?

    Ideally we are looking to provide a csv format as well, which complicates things.

    Thanks for the ideas

  4. #4
    Join Date
    Nov 2004
    out on a limb
    Personally I wouldn't distribute this sort of information solely in a CSV format.

    I'd want to write a process that takes in an encrypted file (CSV is fine as the underlying format) and decrypts it during import.
    If I was being cute I'd consider putting something into the data stream which uniquely identifies the customer this file was sent to. Whether you do that as a unique CSV for each customer, or as part of the decrypt process reading a common CSV file, is up to you.
    The 'something' is up to you: it could be changing the spacing or punctuation, adding extra characters to the description, altering a specific record, or adding a ghost record. The problem is that although your customers may respect the licence terms, their customers or web visitors may not.
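
    A minimal sketch of the punctuation variant, assuming each customer has a numeric ID below 65536 and that descriptions normally carry no trailing full stop. The function names and the 16-bit width are made up for illustration. Each of the first 16 descriptions carries one bit of the customer number: a trailing full stop means 1, none means 0.

```python
BITS = 16  # assumed width of the customer number


def embed_mark(descriptions: list, customer_no: int) -> list:
    """Hide customer_no in the trailing punctuation of the first BITS descriptions."""
    if not 0 <= customer_no < 2 ** BITS:
        raise ValueError("customer number does not fit in the mark")
    if len(descriptions) < BITS:
        raise ValueError("not enough rows to carry the mark")
    marked = []
    for i, text in enumerate(descriptions):
        # Normalise every row first so existing trailing dots cannot confuse the mark.
        base = text.rstrip(". ")
        if i < BITS and (customer_no >> i) & 1:
            base += "."
        marked.append(base)
    return marked


def read_mark(descriptions: list) -> int:
    """Recover the customer number from a marked list of descriptions."""
    value = 0
    for i, text in enumerate(descriptions[:BITS]):
        if text.endswith("."):
            value |= 1 << i
    return value
```

    This only survives if the rows stay in their original order and nobody tidies the punctuation, so it would need combining with something sturdier, such as a ghost record.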

  5. #5
    Join Date
    Feb 2004
    In front of the computer
    I'd suggest hosting the database yourself, then providing one or more interfaces like SOAP to allow the user or their application to execute specific queries. This offers several benefits:

    1) The user always gets the most current data
    2) You can choose to use more equitable pricing models
    3) You keep control of the data itself

    The user having instant access to the most current data is a huge benefit to them. If changes are made on your system, the new information is available to the user as soon as you push that change to production instead of waiting for the next update file to arrive.

    If you use an interactive access method, you can use different charging methods to suit your customers' needs. When selling the CSV file, you have no practical control over how the data is used, and the revenue you generate bears no relation to the customer's actual usage. Using SOAP you can tell when the data was used, by which customer (based on login information), and from which TCP/IP address. This may allow you to charge more fairly for the use of the data, and it can also show you where the interest/use lies, which will let you make better decisions about how and where to invest in data quality and quantity.
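
    To make the metering idea concrete, here is a small in-memory sketch of the kind of log a hosted service could keep for each request. The class, its method names and the flat per-query price are assumptions for illustration; a real service would write to a database and take the customer from the login.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class UsageLog:
    """Records who queried what, when, and from which address."""
    entries: list = field(default_factory=list)

    def record(self, customer: str, ip_address: str, query: str, when=None) -> None:
        # Default to the current UTC time when the caller does not supply one.
        when = when or datetime.now(timezone.utc)
        self.entries.append((when, customer, ip_address, query))

    def queries_per_customer(self) -> Counter:
        return Counter(customer for _, customer, _, _ in self.entries)

    def invoice(self, price_per_query: int) -> dict:
        """Charge each customer a flat price (e.g. in pence) per query made."""
        return {c: n * price_per_query for c, n in self.queries_per_customer().items()}

    def popular_queries(self, top: int = 5) -> list:
        """Show where the interest lies: the most frequent queries with counts."""
        return Counter(query for _, _, _, query in self.entries).most_common(top)
```

    For example, two "Bosch freezers" searches from one shop and one from another, at a price of 2 per query, would invoice the first shop 4 and the second 2.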

