Results 1 to 13 of 13
  1. #1
    Join Date
    Apr 2007
    Posts
    130

    Unanswered: How to create dedicated(private) views for different users

    Hi ALL,



    The project we're working on is a document retrieval system. Two kinds of Features for each document in a repository are extracted as described in table-1 and table-2.


    Table1
    DocID Word Freq
    1 Book 9
    1 Study 3
    2 Work 4
    2 Run 5
    2 User 2
    3 Search 6
    3 Sweat
    Table2
    DocID Word1 Word2 Freq
    1 Book Read 3
    1 Study English 2
    2 Work Hard 1
    3 Sweat warm 1


    Given an example document, which is described as
    ExTable1
    Word Freq
    Book 8
    Read 7

    ExTable2
    WordA WordB Freq
    Book Read 2
    Book Write 1

    I need to calculate the similarities between the example document and each of the document in the repository.

    For example, the similarity between Doc1 and the example document( based on ExTable1 and Table1) can be calculated as follows

    Read: 9+8= 17
    the similarity based on ExTable2 and Table3 can be calculated as follows

    Book--Read: 3+2 = 5

    Finally, calculating the final result: 17+5=22

    In my program, I implement the above operations through two steps.

    First create two views corresponding to the two-level matching. Then, based on the two views to calculate the final result.

    First Step:
    Code:
    CREATE view view1
          AS SELECT DocID, SUM(ExTable1.Freq + Table1.Freq) Score 
    FROM 
         ExTable1, Table1
    WHERE 
         ExTable1.Word = Table1.Word 
    Group BY DocID order BY Score;
    
    CREATE view view2 
         AS SELECT DocID, SUM(ExTable2.Freq + Table2.Freq) Score 
    FROM 
         ExTable2, Table2 
    WHERE 
         ExTable2.WordA = Table2.WordA AND ExTable2.WordB = Table2.WordB 
    Group BY DocID order BY Score;


    STEP 2

    Code:
    SELECT view1.DocID As DocID, view1.Score+view2.Score As Score FROM view1 LEFT JOIN view2 on view1.DocID=view2.DocID ORDER BY Score
    However, this approach is problmetic when the system receives two or more user query requests simultaneously, since view1 and view2 are shared by all users. The contents of view1 and view2 may be rewritten when another user makes his request.

    Now, I am not sure how to solve this problem. Someone suggested me to merge the above two statements for creating view1 and view2 into one statement. I am not sure whether this is feasible, and whether it is the right solution for this problem.

    Please help.
    Last edited by cy163; 10-15-08 at 01:20.

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by cy163
    The contents of view1 and view2 may be rewritten when another user makes his request.
    um, no

    views are not "rewritten" unless each user runs the CREATE VIEW statement, which you would not want to allow in the first place

    a view is nothing but a SELECT statement that has been given a name

    you should be able to solve your problem without views
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Apr 2007
    Posts
    130
    r937, thank you for your quick reply.

    Would you please take a close look at my problem and give me some suggestions. I have been stuck on this problem for a long time.

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    suggestions about what? views? users? document similarities?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  5. #5
    Join Date
    Apr 2007
    Posts
    130
    about how to calculate document similarities without using views in the above cases. In my method, the final score is generated based on two scores which in turn is calculated by using two 'create two view' statements. As you suggested, using view is not a wise way . However, I do not know how to calculate the final score without using views.
    Last edited by cy163; 10-14-08 at 11:40.

  6. #6
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    okay, i'll walk you through it

    let's start with your final query

    since i *hate* scrolling sideways to understand a query, i'll re-write it here for your convenience --
    Code:
    SELECT view1.DocID
         , view1.Score+view2.Score As TotalScore 
      FROM view1 
    LEFT OUTER
      JOIN view2 
        ON view2.DocID = view1.DocID 
    ORDER 
        BY TotalScore
    i've made a couple of minor changes, see if you can spot them

    okay, now here's how to do it without views

    instead of actual VIEWs in your FROM clause, use inline views

    inline views are also called derived tables

    they are subqueries which produce a specific tabular result (i.e. table)

    this is, incidentally, where the "structured" part of structured query language comes in

    wherever you can reference a table (or a view), you can reference a subquery

    so let's substitute your view definition queries into your final query --
    Code:
    SELECT view1.DocID
         , view1.Score+view2.Score As TotalScore 
      FROM (
           SELECT Table1.DocID
                , SUM(ExTable1.Freq + Table1.Freq) Score 
             FROM Table1
           INNER
             JOIN ExTable1 
               ON ExTable1.Word = Table1.Word 
           GROUP 
               BY Table1.DocID
           ) AS view1 
    LEFT OUTER
      JOIN (
           SELECT Table2.DocID
                , SUM(ExTable1.Freq + Table1.Freq) Score 
             FROM Table2
           INNER
             JOIN ExTable2
               ON ExTable2.WordA = Table2.WordA 
              AND ExTable2.WordB = Table2.WordB 
           GROUP 
               BY Table2.DocID
           ) AS view2 
        ON view2.DocID = view1.DocID 
    ORDER 
        BY TotalScore
    neat, eh?

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  7. #7
    Join Date
    Apr 2007
    Posts
    130
    Thank you r937. Again, I learn a lot from you.

    In addition, I wonder if there is some methods/mechnisms in SQL that can provide a temporary type tables or something like that for programmers to store intermediate data which then can be used to produce final result using SQL statement.

    In some cases, I STRONGLY feel that I do need this kind of stuffs (intermediate tables) to facilitate database programming.

    I am new to database programming have limited knowledge. So maybe the above thinking is wrong. Programmers have to go the way of merging serveral SQL statements into complicated one to eliminate the need of creating intermediate tables. This is the only way, right.

    By the way, is temporary tables can only be created and used by database administrator. that is, programmer cannot create and use temporary tables in their program?

  8. #8
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by cy163
    ...I wonder if there is some methods/mechnisms in SQL that can provide a temporary type tables or something like that
    yes, the method is temporary tables

    look them up in da manual

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  9. #9
    Join Date
    Apr 2007
    Posts
    130
    Thanks r937.

    Actually, the real picture is a bit more complicated in thatt some documents in the repository should precluded from matching with the example document.

    That is, given an example document (a query), only qualified documents in the repository are compared with it to get the similarities.

    I have difficult in implementing the process of document filtering using SQL statements.

    To filter documents, two fields 'LocationName' and 'PersonName' are added to ExTable1 and Table1.

    the ExTable1 and Table1 is changed as follows.
    Table1
    DocID Word LocationName PersonName Freq
    1 Book N N 9
    1 Study N N 3
    2 Work N N 4
    2 Run N N 5
    2 User N N 2
    3 Search N N 6
    3 Sweat N N 7
    3 London Y N 3
    3 Clinton N Y 1


    ExTable1
    Word LocationName PersonName Freq
    Book N N 8
    Read N N 7
    London Y N 2
    Clinton N Y 6


    No changes to Table2 and ExTable2.

    The process of document filtering is described below.

    Rule1, (Based on ExTable1 and Table1):
    if there are person names in the title of the example document, then those documents in the repository , which do NOT include any of the person names will be ruled out. The remaining documents will continue to be evaluated with the following rules before they become qualified.

    Rule2(Based on ExTable1 and Table1):
    A document is qualified if it contains at least one location name which appears in the example document.


    Rule3(Based on ExTable2 and Table2):
    A document is qualified if there are more than one word-pair matches between the example document and this document, by examing ExTable2 and Table2.


    That is, documents after filtering by rule1 can ruled as qualified by rule2 or rule3.

    I have no idea how to implement the doucument filtering. SHould I use a stored procedure to generate a tabular reuslt(Tabular_Result) containing ALL DocIDs of qualified documents and use it in the statemennt to generate fiew1, like
    (
    SELECT Table1.DocID
    , SUM(ExTable1.Freq + Table1.Freq) Score
    FROM Table1
    INNER
    JOIN ExTable1
    ON ExTable1.Word = Table1.Word AND Table1.DocID IN Tabular_Result
    GROUP
    BY Table1.DocID
    ) AS view1
    However, I am not sure whether a stored procedure can generate a tabular result, and whether wiser approaches exists for this purpose.

    Thanks. Really need your kind help.
    Last edited by cy163; 10-15-08 at 01:51.

  10. #10
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    what do LocationName and PersonName have to do with documents?

    and why are they Y/N fields?

    and what type of filtering did you want to do?

    the answer will be to add WHERE clauses to the subqueries
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  11. #11
    Join Date
    Apr 2007
    Posts
    130
    'Word's are extacted from a document, LocationName and PersonName, both are Boolean value(Y/N), indicate whether a particular 'Word' is a name of a place or a name of Person.

    Place name and person name in a document are important info for a document matching. Given an example document containing a person name, if a candidate document is similar to the exmaple document, it is very likely the candidate doucment contain the person name in its text title or text body. Hence there is no need to do matching work between the example document and those documents which have no the person name in its text. Similarly, documents cannot meet the rule2 or rule3 also will be ruled out.

    Based on this observation, we figure out the document filtering rule-1.

    and what type of filtering did you want to do?[
    What do you mean. I will use the 3 rules described in my original post to filter document.

    Hope I have made my idea more clear.
    Last edited by cy163; 10-15-08 at 11:23.

  12. #12
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    okay, did you understand how i incorporated your views into a single query?

    now you are saying that the introduction of LocationName and PersonName might require you to use a stored procedure to generate a tabular result?

    i don't think that will be necessary

    all you have to do is add the necessary WHERE conditions to the subquery

    in other words, the "view1" subquery is your temporary table
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  13. #13
    Join Date
    Apr 2007
    Posts
    130
    Yes I understand your meaning of adding WHERE clause to the subquery.

    However, as shown in my previous post, the 3 rules are complicted, and I have difficult in translating the 3 rules into a SLQ WHERE clause due to my limited SQL skills. So, I figure out the way of using stored procedure or function to generate a result set, and calling the procedure/function in the subquery. However, I am not sure whether this method is feasible, since I do not know

    (1) How to call a procedure/function in the subquery;
    (2) Can I make a user-defined function return a result set.

    About (2) someone told me that in MySQL user-defined function cannot return a result set http://forums.mysql.com/read.php?10,...291#msg-230291. So, i am not sure whether this way is correct or not.

    r937, Please help me.
    Last edited by cy163; 10-15-08 at 11:44.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •