The Bloggers

jamie and robin Mark and Jamie Dog
Jamie, Robin, Mark and Jamie Dog

Search


  • Popular Posts

  • Topics



    Links



    Merging Two Access Databases with Duplicated Records

    By Jamie | May 16, 2007

    When you need to merge data from two separated Microsoft Access databases which includes duplicated records (rows) your sometimes best extracting the data into Excel. Going down this route rather than an automated Access wizard. This allows you to perform additional checks that records aren’t repeated, produce reports on the differences between the data sets and most importantly, peace of mind.

    Download my Excel example for guidance.

    Exporting and Merging


    1. Extract the records from the Access database into an Excel spreadsheet. You need to do this twice, i.e. for each data set into individual spreadsheets.
    2. Next create a new Excel spreadsheet. So hence you will now have three Excel spreadsheets
      1. Exported Access Data (A) - The intended destination for all merged data / original
      2. Exported Access Data (B)  - Source
      3. Merged DataSource (C)

    3. Copy each data source (A & B) into individual worksheets so you have two sheet tabs, name them DataSourceA and DataSourceB inside the Merged DataSource (C).
    4. Take a note of the number of rows in each worksheet, minus the fields names. We will use this information later on to double check our merges have been correct.
    5. Create a fourth sheet called Combined, which will be both data sets merged with out the duplicated records.
    6. Create a third worksheet and call it DuplicatedData. Inside this sheet copy both sets of data (DataSourceA and DataSourceB), one after the other. Remember the last set of data you don’t need the field names! Check that the total number of rows equals that of the two DataSourceA and DataSourceB combined.
    7. Next we need to filter out duplicated records. Go to Data > Filter > Advance Filter. In List Range click on the square that is located above “1″ and left of the “A”. Tick the box “Unique records only” and then click OK. The number of rows should reduce and the row numbers turn blue. Select all the data and copy it into the Combined worksheet tab.

    Import

    Now depending on the size and importance of the data you can now import the data back into Access, but if you need to report the differences and check what Excel has done, follow the next steps. Remember you would need to remove all data from the original Access table to import data from the Excel spreadsheet as Access will try and enforce data integrity.

    Double Checking

    1. Inside DataSourceA and B insert an additional column next to the primary key. Click on DataSourceA worksheet and on the second row down (i.e. avoiding the field names) click on the formula bar and click on the “Fx” button. Search for “COUNTIF“. Select the range as the other DataSourceB primary column (Typically A and make sure it’s the whole column, by clicking above row 1) and the Criteria as DataSourceA primary cell. Replicate this down for each record. Repeat this process for DataSourceB.

      My two formula’s read something like…
      =COUNTIF(DataSourceB!A:A,A2)
      =COUNTIF(DataSourceA!A:A,A2)

      More information COUNTIF at…
      http://office.microsoft.com/en-us/excel/HP052090291033.aspx

      http://www.techonthenet.com/excel/formulas/countif.php

      http://www.homeandlearn.co.uk/ME/mes7p3.html

    2. Now create a fifth Worksheet called unique. This worksheet will contain only the records needed to update the Access Database, i.e. the intended merge destination.

    3. In DataSourceB click on Tools > Auto Filter. In the COUNTIF column select “0” from the drop down to filter only records that can’t be found in DataSourceA. Copy all these records to the Unique worksheet tab. Go to DataSourceA and also filter this to show records that can’t be found in DataSourceB, but just note the number of unique records.
    4. Now we are going to double check that Excel’s duplicated row filter has worked successfully by counting and matching the number rows(records) from various work sheets. Minus the field names at the top!

      DataSourceA (Number of Rows) + Unique (Number of Rows) == Combined (Number of Rows)

      So in the example I’m pretty much sure that Excel’s Duplicated Row Filter has worked because…

      DataSource (7 rows) + Unique (2 Rows) == Combined (9 Rows)

    5. Now you can import the data from worksheet Unique into your Access Database without any data integrity issues from Access.
    6. From completely the above you can fairly easily produce some reports that show unique records to each data sets, duplicated data and actual imported data. 

    Tips

     

    Technorati tags: , , , , ,

    Topics: Business Applications, Tech Overload |

  • Getting your lazy co-workers to help with the business blog
  • How to send sensitive data securely
  • Networking and Geese
  • How to move your WordPress blog and keep the traffic flowing
  • Bookmarks On Muiltple Computers
  • Comments