Results 1 to 2 of 2
  1. #1
    Join Date
    Nov 2016
    Posts
    1

    Unanswered: Query Help - Duplicate Records

    Hi,

    I'm struggling to build new functionality into an existing MySQL query.

    The purpose of this function is to find duplicate entries in a table. The table is used by a data capture form attached to a Wi-Fi network. When a guest connects, they are required to enter some personal details in before internet access is granted.

    Traditionally, the portals have been configured to only capture data the first time a device connects. For subsequent connections, the device is “remembered” and is automatically authorized. For some reason I’ve found there to be a lot of duplicate data – whether that’s from users not giving enough time for the Wi-Fi system to authorize them, or for some other reason. To keep the captured data clean, this query is run. The PHP function to then delete the duplicate rows is below. Although I understand this is purely a database forum, I thought it'd be useful to post this too - the query itself however is pure MySQL.

    $queryDuplicateRows = mysql_query("SELECT COUNT(mac) as cnt, GROUP_CONCAT(id) AS ids FROM $table GROUP BY mac HAVING cnt > 1");



    function deleteDuplicateRows(){
    global $table;

    $queryDuplicateRows = mysql_query("SELECT COUNT(mac) as cnt, GROUP_CONCAT(id) AS ids FROM $table GROUP BY mac HAVING cnt > 1");

    while($row = mysql_fetch_array( $queryDuplicateRows )) {
    // Find position in string of IDs of the first comma, to denote the end of the first value
    $pos = strpos(strval($row[1]), ",");
    // Create substring, omitting the first value so that record can be retained
    $allIDsOfDuplicateMAC = substr(strval($row[1]), $pos+1);
    // Create array from the substring, using the commas as the delimiter
    $arrayOfIDsOfDuplicateMAC = explode(",", $allIDsOfDuplicateMAC);
    foreach ($arrayOfIDsOfDuplicateMAC as $i){
    mysql_query("delete from $table where id=$i");
    }
    }
    }


    Some of my customers are now asking for a record of each time a device connects. For a reconnection, the device is still “remembered” (via a search in the database based on the device MAC address), and then duplicate data is then entered into a new row into the table. Because of the query above though, this will be classed as duplicate data. What I want is to modify the function so that it only returns rows that have multiple entries within a single day. For example, below could be an example dataset returned by the above query:

    MAC Name Date
    AA:BB:CCD:EE:FF Chris 11/05/2016
    AA:BB:CCD:EE:FF Chris 12/05/2016
    AA:BB:CCD:EE:FF Chris 12/05/2016 ***
    AA:BB:CCD:EE:FF Chris 13/05/2016
    AA:BB:CCD:EE:FF Chris 14/05/2016
    AA:BB:CCD:EE:FF Chris 14/05/2016 ***
    AA:BB:CCD:EE:FF Chris 14/05/2016 ***
    AA:BB:CCD:EE:FF Chris 15/05/2016

    The rows with stars next to them would be classed as errors since there are multiple entries for that day. The output would then need to be passed to the PHP function in the same way. The data in the "date" field is just VARCHAR, and appears in the same format as the above.

    I'm now completely lost with trying to develop this query further. If anybody could provide some help/thoughts I'd really appreciate it.

    Kind regards,
    Chris.

  2. #2
    Join Date
    Oct 2007
    Posts
    161
    Provided Answers: 9
    what about something like

    select columns from table1 a
    where 1<(select count(*), date from table1 b
    where a.id = b.id
    group by date)

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •