Return grouped objects based on similar actions #16

Open
archonic opened this issue Mar 28, 2013 · 7 comments

@archonic

I'm liking the output of sdiff but each "segment" is a separate object. Would it be possible to merge adjacent objects with similar actions into 1 object?

In my case I'm feeding in arrays of sentences. If someone adds a paragraph, the difference is shown as a collection of new sentences. Instead, I would like one <ins> tag around the whole new paragraph.

Update: this is what I've done as a workaround for now:

    def consolidateDiff(sdiff)
      lastAction = ''
      sdiff.each_with_index do |diff, index|
        if diff.action == lastAction
          sdiff[index-1].old_element << diff.old_element unless sdiff[index-1].old_element.nil?
          sdiff[index-1].new_element << diff.new_element unless sdiff[index-1].new_element.nil?
          sdiff.delete_at(index)
          consolidateDiff(sdiff)
        end
        lastAction = diff.action
      end
    end

@halostatue
Owner

I'm not quite sure what you're asking for. Can you provide me a test case that shows a failing condition? It sounds intriguing.

@archonic
Author

archonic commented Apr 1, 2013

I should have mentioned: it's a feature request, not a bug. I found it convenient to group together similar adjacent actions. This way, if I do an sdiff on two bodies of text that are split into sentences, a new paragraph appears as a single "+" rather than four "+"s, one for each sentence. Here's the context (apologies for the length!):

    # Append similar actions into 1 "change". Recursive.
    def consolidateDiff(sdiff)
      lastAction = ''
      sdiff.each_with_index do |diff, index|
        if diff.action == lastAction
          sdiff[index-1].old_element << diff.old_element unless sdiff[index-1].old_element.nil?
          sdiff[index-1].new_element << diff.new_element unless sdiff[index-1].new_element.nil?
          sdiff.delete_at(index)
          consolidateDiff(sdiff)
        end
        lastAction = diff.action
      end
    end

    def insertClass(body, styles)
      elements = %w[p ol ul h6 h5 h4 h3 h2 h1]
      elements.each do |element|
        body = body.gsub("<#{element}", "<#{element} class=\"#{styles}\"")
      end
      return body
    end

    def compare
      #versionOne is 'new', versionTwo is 'old' (typically)
      @versionOne = @section.get_version params[:d1]
      @versionTwo = @section.get_version params[:d2]
      versions = @section.send :"#{@versionOne.type.underscore.pluralize}"
      @mostRecentVersionID = versions.last.id
      @vnum = versions.size
      @nicetype = @versionOne.type.underscore.split('_').first

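      # Split after each whitespace character so every token keeps its trailing
      # space; a lookbehind avoids breaking on literal '|' characters in the body.
      seq1 = @versionOne.body.split(/(?<=\s)/)
      seq2 = @versionTwo.body.split(/(?<=\s)/)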

      sdiff = Diff::LCS.sdiff(seq2, seq1)

      consolidateDiff(sdiff)

      # Output the compare all pretty-like
      diffHTML = ''
      sdiff.each do |diff|
        case diff.action
          when '='
            diffHTML << diff.new_element
          when '!'
            diffHTML << "<span class=\"diff-wrapper\">"
            diffHTML << insertClass(diff.old_element, "del") << insertClass(diff.new_element, "ins")
            diffHTML << "</span>"
          when '-'
            diffHTML << "<span class=\"diff-wrapper del\">"
            diffHTML << insertClass(diff.old_element, "del")
            diffHTML << "</span>"
          when '+'
            diffHTML << "<span class=\"diff-wrapper ins\">"
            diffHTML << insertClass(diff.new_element, "ins")
            diffHTML << "</span>"
        end
      end

      @compareBody = diffHTML.html_safe

      respond_to do |format|
        format.html
      end
    end

I figure consolidateDiff would be a useful addition to the gem. If there's a simpler way to do this, let me know. It's helpful to have adjacent similar actions as one action because I'm later going to implement an "approve changes" tinyMCE plugin where an editor can approve each change between 2 revisions.
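
For comparison, here is a minimal non-recursive sketch of the same grouping idea. It is not part of diff-lcs. The `Change` struct is a hypothetical stand-in for the objects sdiff returns, modelling only the `action`, `old_element`, and `new_element` fields used above, and `consolidate_diff` is an illustrative name. It assumes the elements are strings and returns a new array rather than deleting from the one being iterated.

```ruby
# Hypothetical stand-in for the change objects returned by sdiff, so the
# sketch runs without the gem; only the three fields used here are modelled.
Change = Struct.new(:action, :old_element, :new_element)

# Merge each run of adjacent changes that share an action into one Change.
# Builds a new array instead of mutating the input while iterating over it.
def consolidate_diff(changes)
  changes.chunk_while { |a, b| a.action == b.action }.map do |run|
    old_parts = run.map(&:old_element).compact
    new_parts = run.map(&:new_element).compact
    Change.new(
      run.first.action,
      old_parts.empty? ? nil : old_parts.join,
      new_parts.empty? ? nil : new_parts.join
    )
  end
end

changes = [
  Change.new('=', 'Intro. ', 'Intro. '),
  Change.new('+', nil, 'First '),
  Change.new('+', nil, 'new '),
  Change.new('+', nil, 'sentence. '),
  Change.new('-', 'Old. ', nil)
]

merged = consolidate_diff(changes)
merged.each do |c|
  puts "#{c.action} #{c.old_element.inspect} -> #{c.new_element.inspect}"
end
```

With real sdiff output, the merged entries would need to be mapped back to whatever object type the rendering code expects. `chunk_while` requires Ruby 2.3 or later.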

@ghost ghost assigned halostatue Apr 4, 2013
@halostatue
Owner

Interesting. I'll have to play with this some to consider it. It sounds like a nice basis for a 1.3 release.

@archonic
Author

I updated consolidateDiff to actually work (derp). I also updated compare to split on words (space-delimited) instead of sentences. consolidateDiff now makes the space-delimited comparison output manageable. I could do a character comparison (and forget splitting altogether), but in my wiki-like diff I wouldn't want confusing mid-word insertions, and it would complicate having HTML tags render in the output.

It's getting some good output, but the format of the output gets messed up when there's a change involving a word adjacent to an HTML tag. The ideal split is like this:

    seq1 = "<p>Here is a paragraph. A sentence with <strong>bold text</strong>.</p><p>The second paragraph.</p>"
    seq1.magic
    => ["<p>", "Here ", "is ", "a ", "paragraph. ", "A ", "sentence ", "with ", "<strong>", "bold ", "text", "</strong>", ".", "</p>", "<p>", "The ", "second ", "paragraph.", "</p>"]

So I'm making use of nokogiri but it's not very pleasant. I have to blow up everything then reconstruct an array with the html tags and their attributes intact. I'll post the module I have once I know it works.
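
For simple, well-formed markup, the ideal split shown above can be approximated with a single regular expression instead of a parser. The sketch below is a swapped-in technique, not the Nokogiri approach described here. `tokenize_html` and `TOKEN_PATTERN` are hypothetical names, and the pattern assumes every `<` opens a complete tag and that attribute values contain no `>`.

```ruby
# Hypothetical helper (not part of diff-lcs or Nokogiri): split an HTML string
# into whole tags, words with their trailing whitespace, and any whitespace
# that stands alone between tags.
TOKEN_PATTERN = /<[^>]+>|[^<\s]+\s*|\s+/

def tokenize_html(html)
  # String#scan with no capture groups returns each full match as a string.
  html.scan(TOKEN_PATTERN)
end

seq1 = "<p>Here is a paragraph. A sentence with <strong>bold text</strong>.</p><p>The second paragraph.</p>"
tokens = tokenize_html(seq1)
p tokens
# => ["<p>", "Here ", "is ", "a ", "paragraph. ", "A ", "sentence ", "with ",
#     "<strong>", "bold ", "text", "</strong>", ".", "</p>", "<p>", "The ",
#     "second ", "paragraph.", "</p>"]
```

Because tags are kept whole, attributes stay attached to their tag. Malformed input, such as a stray `<` in text, is not handled and would need a real parser.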

@halostatue
Owner

Looks interesting. I was hoping to be able to work on this for an April release, and now it looks more likely to be a June release at the earliest; my time is just not available for feature work and assessment.

I think you're on the right approach, but it may be possible to just compare the Nokogiri nodes as you can flatten them out. I'm not sure as I haven't tried it.

@skandragon

I have a similar tool at https://gist.github.com/skandragon/92b1ad57e360d3948138

@archonic
Author

@skandragon That's pretty awesome!

I've written that into a module along with an HTML parsing class. Without running through the parser, the compare is correct, except that it will split in the middle of HTML tags. The parser seems to work in small cases, but for some reason it removes tags completely after a certain point. Not sure what's up with that.

https://gist.github.com/archonic/8967057

@halostatue halostatue removed this from the 1.3 milestone Jul 1, 2020