Page 1 of 2 12 LastLast
Results 1 to 10 of 15

Thread: Find Duplicates, 2 Columns, Different Worksheets

  1. #1
    Junior Member
    Join Date
    Jul 2013
    Posts
    12
    Rep Power
    0

    Find Duplicates, 2 Columns, Different Worksheets

    Hello all,

    I have worksheets that contain lists of addresses. My company receives a list of new members. The problem is, sometimes these new "members" are repeat members using a different name, but the same verified address. In the past they were able to get away, until we switched to excel. The problem is that the master list keeps growing and I want to make sure that these new members aren't repeat members. As of right now, their addresses are in Column D in both sheets (sheet 1 = master). Every 2 weeks we update the list.

    The code needed:
    I open the master sheet and run the following code to copy over the new data to the master workbook. As of right now, the code copies it to the master sheet (sheet1). Can you change it so it copies it to a new worksheet and titles the worksheet by today's current date (format: mm-dd-yy)? This is my current code:
    Code:
    Sub BulkImport()
    
       Dim InFileNames As Variant
       Dim fCtr As Long
       Dim tempWkbk As Workbook
       Dim consWks As Worksheet
       Set consWks = ThisWorkbook.Sheets(1)
       InFileNames = Application.GetOpenFilename _
       (FileFilter:="Excel Files, *.*", MultiSelect:=True)
       Application.ScreenUpdating = False
       If IsArray(InFileNames) Then
          For fCtr = LBound(InFileNames) To UBound(InFileNames)
            With Workbooks.Open(Filename:=InFileNames(fCtr))
                    MsgBox "Number of cells that have data: " & Application.WorksheetFunction.CountA(.Sheets(1).Range("A2:J" & .Sheets(1).Range("A" & .Sheets(1).Rows.Count).End(xlUp).Row))
                .Sheets(1).Range("A1:J" & .Sheets(1).Range("A" & .Sheets(1).Rows.Count).End(xlUp).Row).Copy consWks.Range("A" & consWks.Rows.Count).End(xlUp)(2)
                .Close 0
            End With
          Next fCtr
       Else
          MsgBox "No file selected"
       End If
       With Application
          .StatusBar = True
          .ScreenUpdating = True
       End With
       
    End Sub
    Now that the data would be copied over, I would need to compare it with the unique list I already have in the master sheet (sheet1). It would need to compare against a list of over 10,000 (and growing), and then mark the duplicates found. I would then make sure manually that they really are duplicates (since sometimes people have the same address but a different unit number; unless you have a more efficient way to compare the two). I would delete the duplicates manually (most likely) and then have vba code copy over the new ones to the master sheet, continuing the growth of the unique list. I have a total of 10 columns (A-J).

    Also, a previous admin poster explained to me the differences between ThisWorkbook and ActiveWorkbook. Are they interchangeable here? If I swap it out, will the code run as it should? Thanks for all your help!

    I hope it's not too much!!

    Respectfully,

    Amit

  2. #2
    Administrator Admin's Avatar
    Join Date
    Mar 2011
    Posts
    1,123
    Rep Power
    10
    Hi

    Which column holds the unique key to compare ?
    Cheers !

    Excel Range to BBCode Table
    Use Social Networking Tools If You Like the Answers !

    Message to Cross Posters

    @ Home - Office 2010/2013/2016 on Win 10 (64 bit); @ Work - Office 2016 on Win 10 (64 bit)

  3. #3
    Junior Member
    Join Date
    Jul 2013
    Posts
    12
    Rep Power
    0
    It will always be Column D (which is the address column) on all sheets.

  4. #4
    Administrator Admin's Avatar
    Join Date
    Mar 2011
    Posts
    1,123
    Rep Power
    10
    Hi

    try this. adjust the sheet names in the code.

    Code:
    Option Explicit
    
    Sub kTest()
        
        Dim ka, k(), i As Long, n As Long, c As Long, d As Object
        
        Const MasterSheet   As String = "Sheet1"    '<< adjust
        Const NewData       As String = "Sheet2"    '<< adjust
        
        Set d = CreateObject("scripting.dictionary")
            d.comparemode = 1
        
        ka = ThisWorkbook.Worksheets(MasterSheet).Range("a1").CurrentRegion.Resize(, 10).Value2
        
        For i = 2 To UBound(ka, 1)
            If Len(Trim(ka(i, 4))) Then
                d.Item(Trim(ka(i, 4))) = Empty
            End If
        Next
        
        Erase ka
        ka = ThisWorkbook.Worksheets(NewData).Range("a1").CurrentRegion.Resize(, 10).Value2
        ReDim k(1 To UBound(ka, 1), 1 To UBound(ka, 2))
        
        For i = 2 To UBound(ka, 1)
            If Not d.exists(ka(i, 4)) Then
                n = n + 1
                For c = 1 To UBound(ka, 2)
                    k(n, c) = ka(i, c)
                Next
            End If
        Next
        
        If n Then
            'append new record into the master sheet
            With ThisWorkbook.Worksheets(MasterSheet)
                .Range("a" & .Rows.Count).End(xlUp).Offset(1).Resize(n, UBound(k, 2)) = k
            End With
        End If
        
    End Sub
    Cheers !

    Excel Range to BBCode Table
    Use Social Networking Tools If You Like the Answers !

    Message to Cross Posters

    @ Home - Office 2010/2013/2016 on Win 10 (64 bit); @ Work - Office 2016 on Win 10 (64 bit)

  5. #5
    Junior Member
    Join Date
    Jul 2013
    Posts
    12
    Rep Power
    0
    Hmmm.... That code seems to just copy over the data from worksheet 2 to the master sheet. Is it removing duplicates along the way? In case I was unclear in my explanation, I'll attempt to explain in an organized manner:
    1. Step 1: I receive a list of data with 65 columns and up to a 1000 rows of data. (many times a dozen or so of the rows won't have data in every column)
    2. Step 2: I open the new list received and run the following code to clean up the data and keep only the relevant fields.
      Code:
      Sub MailOrganize()
          Columns("C:C").Insert Shift:=xlToRight, CopyOrigin:=xlFormatFromLeftOrAbove
          Columns("B:B").Select
          Selection.Copy
          Columns("C:C").Select
          Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
              :=False, Transpose:=False
          Set rngTemp1 = Range("B:B")
          With Range("A1,B1,C1,D1,E1,F1,G1,H1,I1,J1").Font
              .Bold = True
          End With
          
          Range("A1").Value = "SITE"
          Range("B1").Value = "TEMP_1"
          Range("C1").Value = "TEMP_2"
          Range("D1").Value = "ADDRESS_1"
          Range("E1").Value = "UNIT"
          Range("F1").Value = "CITY"
          Range("G1").Value = "STATE"
          Range("H1").Value = "ZIP"
          Range("I1").Value = "FIRST"
          Range("J1").Value = "LAST"
          
          Set rngTemp1 = Range("B1:B1000")
      '    On Error GoTo SkipTemp1
          rngTemp1.SpecialCells(xlCellTypeBlanks).Select
          Dim l1Cell
          For Each l1Cell In Selection
          l1Cell.Value = " --- "
          Next
          
      'SkipTemp1:
          Set rngTemp2 = Range("C1:C1000")
      '    On Error GoTo SkipTemp2
          rngTemp2.SpecialCells(xlCellTypeBlanks).Select
          Dim l2Cell
          For Each l2Cell In Selection
          l2Cell.Value = " --- "
          Next
          
      'SkipTemp2:
          Set rngSite = Range("A1:A1000")
      '    On Error GoTo SkipSite
          rngSite.SpecialCells(xlCellTypeBlanks).Select
          Dim sCell
          For Each sCell In Selection
          sCell.Value = " --- "
          Next
          
      'SkipSite:
          Set rngName = Range("I1:I1000")
      '    On Error GoTo SkipName
          rngName.SpecialCells(xlCellTypeBlanks).Select
          Dim nameCell
          For Each nameCell In Selection
          nameCell.Value = " --- "
          Next
          
      'SkipName:
      
      End Sub
    3. Step 3: In case the list didn't have a complete 1 thousand entries, I would clear the contents from any "---" marks continuing past the data down to the 1000 row.
    4. Step 4: I would then close the list and open the Master workbook.
    5. Step 5: I would run the following import script:
      Code:
      Sub BulkImport()
      
         Sheets(1).Select
         Dim Srt As Worksheet
         Set Srt = ActiveSheet
         Dim InFileNames As Variant
         Dim fCtr As Long
         Dim tempWkbk As Workbook
         Dim consWks As Worksheet
         Dim szToday As String
         szToday = Format(Date, "mm-dd-yy")
         Sheets.Add
         ActiveSheet.Move After:=Srt
         ActiveSheet.Name = szToday
         Set consWks = ActiveWorkbook.Sheets(2)
         InFileNames = Application.GetOpenFilename _
         (FileFilter:="Excel Files, *.*", MultiSelect:=True)
         Application.ScreenUpdating = False
         If IsArray(InFileNames) Then
            For fCtr = LBound(InFileNames) To UBound(InFileNames)
              With Workbooks.Open(Filename:=InFileNames(fCtr))
                      MsgBox "Number of cells that have data: " & Application.WorksheetFunction.CountA(.Sheets(1).Range("A2:J" & .Sheets(1).Range("A" & .Sheets(1).Rows.Count).End(xlUp).Row))
                  .Sheets(1).Range("A1:J" & .Sheets(1).Range("A" & .Sheets(1).Rows.Count).End(xlUp).Row).Copy consWks.Range("A" & consWks.Rows.Count).End(xlUp)(1)
                  .Close 0
              End With
            Next fCtr
         Else
            MsgBox "No file selected."
         End If
         With Application
            .StatusBar = True
            .ScreenUpdating = True
         End With
         
      End Sub
    6. Step 6: I would then need to compare the new list to the list on the master sheet. I would need to find duplicates based on the members' addresses. The members' addresses will be listed in Column D. Should it find a duplicate, it would mark all the duplicates (preferably the entire row).
    7. Step 7: I would then need to remove the duplicates after assessing that they really are duplicates since sometimes clients have similar addresses but with a different unit or suite number. One way I thought of is to have the duplicates marked on the temp worksheet, and copied to a new worksheet where the entire row with the duplicate found would be a green color background while the entire row being checked against from the master sheet would be copied underneath it to the new worksheet, and would have a yellow background. Although, I'm not sure if it's attainable through the code to do.
    8. Step 8: I would then remove the duplicates from the temp worksheet.
    9. Step 9: I would copy the new unique data over to the master sheet to the bottom, continuing the growth of the sheet. Since all the headers are the same, I would not need the first row from the temp worksheet to be copied over to the master sheet.


    I really really appreciate it. I can't stress that enough. Thank you for all the help you can give.
    If there is anything else needed from me, please let me know.

    Respectfully,

    Amit

  6. #6
    Administrator Admin's Avatar
    Join Date
    Mar 2011
    Posts
    1,123
    Rep Power
    10
    Have you tried the code ? Try to run the code after importing the new list.
    Cheers !

    Excel Range to BBCode Table
    Use Social Networking Tools If You Like the Answers !

    Message to Cross Posters

    @ Home - Office 2010/2013/2016 on Win 10 (64 bit); @ Work - Office 2016 on Win 10 (64 bit)

  7. #7
    Junior Member
    Join Date
    Jul 2013
    Posts
    12
    Rep Power
    0
    Hello Admin,

    The code seems to copy data from the second worksheet to the master sheet. When scrolling through the data, there are many highlighted cells in Column C (?). Although, when searching for duplicates, there are none. So I'm guessing the code automatically removed the duplicates? If the code were to automatically remove duplicates, then I would need it to confirm that data in Column D are duplicates, as well as the data in Column E (which is the unit or apartment number) since some people may live in the same building. And since it would be automatically removing the duplicates, could it give me a msgbox stating how many duplicate rows were found at the end of the code? It is important that sheet2 not have the duplicates either since I will be needing to send a welcome mail to those new members. Also, the bulkimport macro creates a new sheet in the master workbook based on today's date. Could the code you wrote automatically look for sheet #2 instead of giving it a specific name? Bulkimport macro will make sure that the new sheet will always be sheet #2.

    Thank you very much Admin.

    Respectfully,

    Amit

  8. #8
    Administrator Admin's Avatar
    Join Date
    Mar 2011
    Posts
    1,123
    Rep Power
    10
    Hi

    OK. Try this. untested.

    Code:
    Option Explicit
    
    Sub kTest()
        
        Dim ka, k(), i As Long, n As Long, c As Long, d As Object
        Dim DupeCount   As Long, addr As String, wksNewData As Worksheet
        
        Const MasterSheet   As String = "Sheet1"    '<< adjust
        
        Set d = CreateObject("scripting.dictionary")
            d.comparemode = 1
        
        ka = ThisWorkbook.Worksheets(MasterSheet).Range("a1").CurrentRegion.Resize(, 10).Value2
        
        For i = 2 To UBound(ka, 1)
            If Len(Trim(ka(i, 4))) Then
                d.Item(Trim(ka(i, 4))) = Empty
            End If
        Next
            
        Erase ka
        Set wksNewData = ThisWorkbook.Worksheets(2)
        ka = wksNewData.Range("a1").CurrentRegion.Resize(, 10).Value2
        ReDim k(1 To UBound(ka, 1), 1 To UBound(ka, 2))
        
        For i = 2 To UBound(ka, 1)
            If Not d.exists(ka(i, 4)) Then
                n = n + 1
                For c = 1 To UBound(ka, 2)
                    k(n, c) = ka(i, c)
                Next
            Else
                DupeCount = DupeCount + 1
                addr = addr & ",D" & i
                If Len(addr) > 245 Then
                    wksNewData.Range(Mid(addr, 2)).Interior.Color = 65535
                    addr = vbNullString
                End If
            End If
        Next
        If Len(addr) > 1 Then
            wksNewData.Range(Mid(addr, 2)).Interior.Color = 65535
            addr = vbNullString
        End If
        If n Then
            'append new record into the master sheet
            With ThisWorkbook.Worksheets(MasterSheet)
                .Range("a" & .Rows.Count).End(xlUp).Offset(1).Resize(n, UBound(k, 2)) = k
            End With
        End If
        If DupeCount Then
            MsgBox "There are " & DupeCount & " duplicates.", vbInformation
        End If
        
    End Sub
    Cheers !

    Excel Range to BBCode Table
    Use Social Networking Tools If You Like the Answers !

    Message to Cross Posters

    @ Home - Office 2010/2013/2016 on Win 10 (64 bit); @ Work - Office 2016 on Win 10 (64 bit)

  9. #9
    Junior Member
    Join Date
    Jul 2013
    Posts
    12
    Rep Power
    0
    Admin,

    I tested your code. It looks great! It finds the duplicates in sheet2 and appends the unique data to the master sheet. It seems to be highlighting fields in Column C of the master sheet and I'm not sure why. Excel also doesn't let me sort by Cell Color On Top for Column C, so I'm not sure what the reason is for them being highlighted.

    Also, I'd just like to make sure that the duplicate finding process is looking for matches in both columns D and E, since it is appending the unique values automatically, I don't want to miss people's unit numbers.

    THANK YOU VERY MUCH!!!! It looks great!

    I have one other request to the appending process: can you make it so when the new data is appended, automatically in Column K a new data field is created with the value of 0 ?

    Again, thank you very much Admin. Your help is deeply appreciated.

    - Amit

    Edit: I cleared all the rules that were on the master sheet and it seems to fix the problem of the highlighting in column c, although I don't remember every putting a rule on. So that issue is done
    Last edited by ghendi; 07-13-2013 at 11:33 PM.

  10. #10
    Administrator Admin's Avatar
    Join Date
    Mar 2011
    Posts
    1,123
    Rep Power
    10
    Hi

    replace

    Code:
    .Range("a" & .Rows.Count).End(xlUp).Offset(1).Resize(n, UBound(k, 2)) = k
    with

    Code:
    .Range("a" & .Rows.Count).End(xlUp).Offset(1).Resize(n, 10) = k
    Cheers !

    Excel Range to BBCode Table
    Use Social Networking Tools If You Like the Answers !

    Message to Cross Posters

    @ Home - Office 2010/2013/2016 on Win 10 (64 bit); @ Work - Office 2016 on Win 10 (64 bit)

Similar Threads

  1. Replies: 10
    Last Post: 05-23-2013, 12:30 PM
  2. Replies: 4
    Last Post: 04-05-2013, 12:08 PM
  3. Detect Duplicates In Named Ranges With Cross Check
    By Yegarboy in forum Excel Help
    Replies: 3
    Last Post: 10-09-2012, 11:02 AM
  4. Copy Automatically Between Two Worksheets
    By marreco in forum Excel Help
    Replies: 0
    Last Post: 08-27-2012, 04:48 PM
  5. Protecting Elements in Worksheets
    By LeeL in forum Excel Help
    Replies: 1
    Last Post: 07-29-2011, 07:32 AM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •