Results 1 to 6 of 6
  1. #1
    Join Date
    Feb 2010
    Posts
    40

    Unanswered: Find similar, sequential rows in a table

    hi,
    I'm trying to process a table of phone records but, in particular, looking to identify calls made to the same number in a row. For example:

    phone_calls()
    -callingnumber
    -callednumber
    -callduration
    -callID
    -calltime

    suppose a person called from the same phone (calling number) to the same number 3 times in a row (duration 3 sec, 5 sec, 1 sec) each. I'd like to be able to identify that type of pattern separate from say them calling that number every day, hour, etc.

    The phone_calls, however, can have many other calls all interweaved between these 3 calls (other people making calls).

    Basically, I know a time range after which I would not consider them sequential.

    I was thinking some way an anti-distinct (show me rows which are similar) knowing the rules are that the time between calls (calltime2 - calltime1 and calltime 3 - calltime 2) has to be say < 5 seconds.

    Any guidance?

  2. #2
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Quote Originally Posted by dtrobert View Post
    ...
    ... calls made to the same number in a row.

    ... to the same number 3 times in a row (duration 3 sec, 5 sec, 1 sec) ...
    What do you mean by the word "in a row"?
    A row includes each one value for each column(except array column).
    So, "something muliple values in a row" may be nonsense.


    Anyway,
    please publish DDL(s) and INSERT statement(s) to populate the table(s), and the expected results from the data.
    They must be better than narrative explanations,
    to give better understandings of your requirements to others, and to avoid misunderstandings.

  3. #3
    Join Date
    Feb 2010
    Posts
    40

    Find similar, sequential rows in a table

    Okay for 'in a row', I really mean in a sequence where the calltime is only separated by a specific max value. Let me give an example (I think the insert would be self explanatory then).

    call_table()

    callingparty | calledparty | calltime | duration | callID
    --------------------------------------------------------
    4001 4002 '2013-04-22 12:00' 34 44
    4000 5000 '2013-04-22 12:01' 353 45
    4001 4002 '2013-04-22 12:02' 50 46
    4001 4002 '2013-04-23 11:41' 514 47

    In this example, I want to be able to identify that callIDs 44 and 46 were two calls made 'in a row' where the delta between the calltime was < 2 minutes apart. I would not consider, however callIDs 46 and 47 to be 'in a row' because they were a day apart. Keep in mind there can be other calls (callingparty, calledparty) dispersed in between these calls (as in callID 45).

    The output I'd like to see in SQL would be something like:

    callingparty | calledparty | sequentialcalls | startcalltime
    ------------------------------------------------------
    4001 4002 2 '2013-04-22 12:00'

    Here there were the 2 sequential calls made only with the first call starting on startcalltime.

    I'm not suggestion this problem can necessarily be solved in SQL BTW.

    select callingparty, calledparty, count(1) as sequentialcalls
    group by callingparty, calledparty
    having [here's where I get stuck, something like calltime2 - calltime1 < 1 ]


    Quote Originally Posted by tonkuma View Post
    What do you mean by the word "in a row"?
    A row includes each one value for each column(except array column).
    So, "something muliple values in a row" may be nonsense.


    Anyway,
    please publish DDL(s) and INSERT statement(s) to populate the table(s), and the expected results from the data.
    They must be better than narrative explanations,
    to give better understandings of your requirements to others, and to avoid misunderstandings.

  4. #4
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    ...
    Let me give an example (I think the insert would be self explanatory then).
    I afraid that you didn't understand why I asked "DDL(s) and INSERT statement(s) to populate the table(s)".
    Yes, it is easy to construct insert statement from the data which you provided.
    It wouldn't be required much intelligence, it must be a simple work.

    But, I need to share my time for the simple work(without any payment),
    before to test and to reply my idea(SQL query).
    So, I hope questioner to undertake such simple task.
    If you(questioner) undertook such simple task, you might get response from more persons and more quickerly.

  5. #5
    Join Date
    Feb 2010
    Posts
    40

    Find similar, sequential rows in a table

    Okay, here it is.

    create table calls (callid int, calltime timestamp, callingparty varchar(40),calledparty varchar(40) ,duration int)

    insert into calls values( 44, '2013-04-22 12:00:01', '4001', '4002', 34)
    insert into calls values( 45, '2013-04-22 12:01:01', '4000', '5000', 353)
    insert into calls values( 46, '2013-04-22 12:02:01', '4001', '4002', 50)
    insert into calls values( 47, '2013-04-23 11:41:01', '4001', '4002', 514)

  6. #6
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    This is wholly untested "air code", but try:
    Code:
    create table calls 
    (  callid       int
    ,  calltime     timestamp
    ,  callingparty varchar(40)
    ,  calledparty  varchar(40) 
    ,  duration     int
    )
    
    insert into calls values( 44, '2013-04-22 12:00:01', '4001', '4002', 34)
    insert into calls values( 45, '2013-04-22 12:01:01', '4000', '5000', 353)
    insert into calls values( 46, '2013-04-22 12:02:01', '4001', '4002', 50)
    insert into calls values( 47, '2013-04-23 11:41:01', '4001', '4002', 514)
    
    CREATE INDEX calls_calling_called
       ON calls
       (callingparty, calledparty)
    
    SELECT *
       FROM calls AS m
       WHERE 3 <= (SELECT Count(*)
          FROM calls AS z
          WHERE  z.callingparty = m.callingparty
             AND z.calledparty  = m.calledparty
             AND z.calltime
                BETWEEN m.calltime - INTERVAL '00:00:05'
                AND     m.calltime + INTERVAL '00:00:05')
    Note that this does NOT do exactly what you requested, but it DOES what telcos normally want.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •