selling_vector property #1888
Conversation
Hi @bwjustus, nice to see this :-) In terms of implementation, would a method to return all 24 Selling lattices be sensible? Ultimately, the benefit of the Selling reduction is to calculate a distance between lattices. I haven't thought too deeply about implementation, so let me know your thoughts.
Hi @mkhorton, I haven't actually seen this paper yet; I had only seen two others on the same topic. I will try to implement the distance function, and I can also see whether returning all 24 lattices would be reasonable in terms of time cost compared to Niggli reduction.
I did my best to follow the paper and write an implementation for distance using the virtual Cartesian point (VCP) method they described. I had a bit of trouble following it, so I'm not sure it's functioning exactly how it should, but it seems correct. The main thing I noticed is that computing all 24 reflections is fairly costly: all 24 are computed for each of the six virtual Cartesian points as well as for the original point, so 7 × 24 = 168 vectors end up being calculated. Would it be helpful for me to add the code to this pull request so you can take a look? Currently it returns the Selling vector of the reflection that gives the minimum distance.
Thanks, yes, adding the code would be helpful. It might be possible to vectorize this code and calculate all 24 reflections in one operation with NumPy, which could make it less costly. As for verification, I'll see if I can find a way to verify the results independently.
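A minimal sketch of the vectorization idea, using random stand-in matrices (the real 24 Selling reflection matrices are not reproduced here): stacking the 6×6 reflection matrices into a single (24, 6, 6) array lets NumPy apply all of them to a Selling vector in one batched matrix product, replacing a Python loop.

```python
import numpy as np

# Stand-in data: 24 random 6x6 matrices in place of the actual
# Selling reflection matrices, and a random 6-vector as the
# (hypothetical) Selling vector.
rng = np.random.default_rng(0)
reflections = rng.integers(0, 2, size=(24, 6, 6)).astype(float)
s = rng.standard_normal(6)

# One batched matrix-vector product: (24, 6, 6) @ (6,) -> (24, 6),
# i.e. all 24 reflected vectors at once.
all_reflected = reflections @ s

# Equivalent loop version, for comparison.
looped = np.array([r @ s for r in reflections])
assert np.allclose(all_reflected, looped)
```

The same pattern extends to all seven points (the original plus six VCPs) by stacking them into a (7, 6) array and broadcasting.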
Alright, I just pushed it. Most of the body of the function is just the relevant matrices, so let me know if there's a better way to represent these so that they don't take up so much space. I'm not too familiar with vectorizing code, but if there's a particular NumPy function that would make this process faster, let me know and I can try to change the implementation.
From a NumPy perspective, the basic principle is removing for loops where possible and combining multiple operations into a single operation; for example, there's numpy.einsum and related functions. It's not clear the optimization will be necessary, however. How long does this typically take? (If you're in Jupyter, you can use %timeit.)
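For timing outside Jupyter, here is a hedged sketch using the standard-library timeit module; the function below is a placeholder stand-in, not the actual selling_vector implementation.

```python
import timeit

import numpy as np

def selling_like_computation():
    # Placeholder stand-in for the real Selling-vector computation:
    # a small matrix-vector product of comparable size.
    m = np.random.default_rng(0).standard_normal((6, 6))
    v = np.ones(6)
    return m @ v

# Average runtime per call, in seconds.
n = 1000
per_call = timeit.timeit(selling_like_computation, number=n) / n
print(f"{per_call * 1e6:.1f} microseconds per call")
```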
Just calculating the Selling vectors takes about 60–90 microseconds, while the whole distance function takes about 1.5–2 milliseconds. I'm familiar with einsum, but I'll check out the others too to see if I can speed it up. I'm not sure if this is an issue we want to consider yet, but I was also wondering how to convert a reduced Selling vector back to lattice vectors so I can return a Selling-reduced lattice. Shyam and I had looked at this paper, but the basis for E3x3 given there in Table 1 only has components in the x-direction, which doesn't seem to make sense for creating a 3D cell. Let me know if you have any ideas on how to do this.
Another possible issue I noticed is that all six virtual Cartesian points I calculated are identical: each is just the original, unreduced Selling vector multiplied by -1. Maybe I misunderstood the wording of the paper, so let me know if it seems like there's something I'm misinterpreting. Also, I've been able to eliminate some of the for loops by using multi_dot. It doesn't seem to change the runtime in any significant way right now, but I'll keep removing for loops where I can.
Ah, yes, that sounds suspect. In terms of profiling, there are line profilers available that can tell you where most of the time is being spent. Two milliseconds is in general very good, but it could still be prohibitive if we wanted to do a search across the entire MP database. Drop me an email at mkhorton@lbl.gov and I'll see if I can put you in touch with someone who could help explain the VCPs.
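As a sketch of the profiling workflow, the standard-library cProfile module gives a function-level breakdown (line_profiler gives per-line numbers). The profiled function here is a stand-in with the same shape of work as the distance code, not the real implementation.

```python
import cProfile
import pstats

import numpy as np

def distance_like_computation():
    # Placeholder for the real Selling distance computation:
    # distances from 168 stand-in vectors to a target point.
    pts = np.random.default_rng(0).standard_normal((168, 6))
    target = np.zeros(6)
    return np.linalg.norm(pts - target, axis=1).min()

profiler = cProfile.Profile()
profiler.enable()
for _ in range(100):
    distance_like_computation()
profiler.disable()

# Print the five most expensive calls by cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```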
I tried replacing all the for loops with just a few lines that accomplish the same thing, and the runtime did not really change. I tried some code profiling, and it seems as though more than half the time comes from using numpy.linalg.norm to find the distances between all 168 VCP reflections and the point representing the other lattice. I'm not sure there's a way around this, because it is part of the algorithm, but hopefully if we do get in touch with Nicholas Sauter he might have some advice.
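If the norm is currently being called once per vector, the 168 calls can be collapsed into a single call with the axis argument, which is usually much faster than the loop. A sketch with random stand-in data (the variable names are illustrative, not the PR's actual variables):

```python
import numpy as np

rng = np.random.default_rng(0)
vcp_reflections = rng.standard_normal((168, 6))  # stand-in for the 168 VCP reflections
other_point = rng.standard_normal(6)             # stand-in for the other lattice's S6 point

# Loop version: one norm call per vector (168 calls).
looped = np.array([np.linalg.norm(v - other_point) for v in vcp_reflections])

# Vectorized version: a single norm call over axis 1.
vectorized = np.linalg.norm(vcp_reflections - other_point, axis=1)

assert np.allclose(looped, vectorized)
print(vectorized.min())  # the minimum distance over all reflections
```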
Hi @bwjustus, just following up on the status of this PR. What are the next steps?
The distance function is implemented, so Shyam has been helping me find different ways to test it and understand the behavior a bit better. It behaves as expected (more dissimilar lattices have a greater Selling distance, the distance between two identical lattices is 0, etc.), but I'm still writing a lot of different tests to get a better feel for what a Selling distance value actually means so we can use it effectively. I can push the new Selling distance function if you'd like to take a look. I won't have access to my computer again until Sunday afternoon, but I can push it then.
Sounds great! Just wanted to check in that nothing was stuck; I'd be happy to take a look when it's ready.
Sorry, it took me a long time to get back to this. The one thing to modify is to make the Selling vector a 1-D array rather than 2-D. Right now, we reshape it to (1, 6); let's keep it as (6,). Can you update that, @bwjustus, and then the PR will be ready to merge.
Sure, I'll take care of that today.
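For reference, converting the (1, 6) array to the requested (6,) shape is a one-liner in NumPy; any of reshape, ravel, or squeeze works here.

```python
import numpy as np

v = np.arange(6.0).reshape(1, 6)  # current shape: (1, 6)

# Any of these produces the 1-D (6,) shape requested above.
flat = v.reshape(6)      # explicit target shape
flat2 = v.ravel()        # flatten to 1-D
flat3 = np.squeeze(v)    # drop the length-1 axis

assert flat.shape == flat2.shape == flat3.shape == (6,)
```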
pymatgen/core/tests/test_lattice.py
def test_selling_dist():
    np.testing.assert_(selling_dist(Lattice.cubic(5), Lattice.cubic(5)) == 0)
    hex_lattice = Lattice.hexagonal(5, 8)
    triclinic_lattice = Lattice.from_parameters(4, 10, 11, 100, 110, 80)
    np.testing.assert_allclose(selling_dist(hex_lattice, triclinic_lattice), 76, rtol=0.1)
    np.testing.assert_allclose(selling_dist(Lattice.tetragonal(10, 12), Lattice.tetragonal(10.1, 11.9)), 3.7, rtol=0.1)
    np.testing.assert_allclose(selling_dist(Lattice.cubic(5), Lattice.from_parameters(8, 10, 12, 80, 90, 95)), 115.6, rtol=0.1)

def test_selling_vector():
    a1 = 10
    np.testing.assert_array_equal(selling_vector(Lattice.cubic(a1)), np.array([0, 0, 0, -a1**2, -a1**2, -a1**2]))
    a2, c2 = 5, 8
    np.testing.assert_array_almost_equal(selling_vector(Lattice.tetragonal(a2, c2)), np.array([0, 0, 0, -a2**2, -a2**2, -c2**2]))
    a3, b3, c3 = 4, 6, 7
    np.testing.assert_array_almost_equal(selling_vector(Lattice.orthorhombic(a3, b3, c3)), np.array([0, 0, 0, -b3**2, -a3**2, -c3**2]))
def test_selling_dist():
    np.testing.assert_(selling_dist(Lattice.cubic(5), Lattice.cubic(5)) == 0)
    hex_lattice = Lattice.hexagonal(5, 8)
    triclinic_lattice = Lattice.from_parameters(4, 10, 11, 100, 110, 80)
    np.testing.assert_allclose(
        selling_dist(hex_lattice, triclinic_lattice), 76, rtol=0.1
    )
    np.testing.assert_allclose(
        selling_dist(Lattice.tetragonal(10, 12), Lattice.tetragonal(10.1, 11.9)),
        3.7,
        rtol=0.1,
    )
    np.testing.assert_allclose(
        selling_dist(Lattice.cubic(5), Lattice.from_parameters(8, 10, 12, 80, 90, 95)),
        115.6,
        rtol=0.1,
    )

def test_selling_vector():
    a1 = 10
    np.testing.assert_array_equal(
        selling_vector(Lattice.cubic(a1)),
        np.array([0, 0, 0, -(a1 ** 2), -(a1 ** 2), -(a1 ** 2)]),
    )
    a2, c2 = 5, 8
    np.testing.assert_array_almost_equal(
        selling_vector(Lattice.tetragonal(a2, c2)),
        np.array([0, 0, 0, -(a2 ** 2), -(a2 ** 2), -(c2 ** 2)]),
    )
    a3, b3, c3 = 4, 6, 7
    np.testing.assert_array_almost_equal(
        selling_vector(Lattice.orthorhombic(a3, b3, c3)),
        np.array([0, 0, 0, -(b3 ** 2), -(a3 ** 2), -(c3 ** 2)]),
    )
Clean up some of the codestyle
The Selling vector order is different on my Mac and on the GitHub Actions Linux runner. This might be a numerical instability issue.
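One plausible cause of platform-dependent ordering is sorting on floating-point values that are equal only up to rounding error, so the tie order depends on the platform's arithmetic and BLAS build. A hedged sketch of one mitigation, rounding to a tolerance before a stable sort; this is illustrative and may not be the actual cause here.

```python
import numpy as np

# Values that are equal within tolerance but differ in the last bits.
s = np.array([-25.000000001, -24.999999999, -25.0, 0.0, 0.0, 0.0])

# Rounding to a fixed number of decimals makes near-ties compare equal,
# and kind="stable" then preserves the original relative order of ties,
# so the result is the same on every platform.
decimals = 6
order = np.argsort(np.round(s, decimals), kind="stable")
stable = s[order]
print(stable)
```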
Hi @bwjustus, nice PR! Can I clarify whether the test cases have been independently verified? E.g., did you use a lattice from a paper where the Selling vectors/distances were known?
Hi @mkhorton! For reference, this is the main paper I used when working on this function: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6492488/. They did not give any test cases/examples for Selling distance, but in the second section of that paper, where S6 space is discussed, they give one example of a calculated Selling vector for an orthorhombic structure: the parameters were (10, 12, 20, 90, 90, 90) and the Selling vector was [0, 0, 0, -100, -144, -400]. I didn't use this exact example, but in general what I found was that for orthorhombic lattices the Selling vector has the form [0, 0, 0, -a^2, -b^2, -c^2]. One of the tests I wrote has this exact form, just with different lengths. The other two Selling vector tests I wrote are for a cubic structure and a tetragonal structure, which follow the same rules. I did not have any reference for the Selling distances, so those tests were used to verify the assumptions that the Selling distance between two identical structures is 0 and that the distance increases as the structures become more dissimilar. I also did some tests where I incrementally increased/decreased lattice lengths and angles to make sure the distance increased/decreased monotonically. I didn't include these tests in the PR, but if you'd like me to add them I can do that. If there's anything else that needs to be added or changed, let me know and I'll take care of it!
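The paper's orthorhombic example can be reproduced from first principles: with b4 = -(b1 + b2 + b3), the six Selling scalars are the pairwise dot products of b1..b4. The sketch below assumes the common ordering s = (b2·b3, b1·b3, b1·b2, b1·b4, b2·b4, b3·b4), which matches the example quoted above; pymatgen's actual ordering may differ, and the helper name is hypothetical.

```python
import numpy as np

def selling_scalars(m):
    """Six Selling scalars from a 3x3 matrix of row lattice vectors.

    Assumed ordering: (b2.b3, b1.b3, b1.b2, b1.b4, b2.b4, b3.b4),
    where b4 = -(b1 + b2 + b3). This is an illustrative convention,
    not necessarily the one implemented in pymatgen.
    """
    b1, b2, b3 = np.asarray(m, dtype=float)
    b4 = -(b1 + b2 + b3)
    return np.array([
        np.dot(b2, b3), np.dot(b1, b3), np.dot(b1, b2),
        np.dot(b1, b4), np.dot(b2, b4), np.dot(b3, b4),
    ])

# Orthorhombic (10, 12, 20, 90, 90, 90): orthogonal lattice vectors.
m = np.diag([10.0, 12.0, 20.0])
print(selling_scalars(m))  # [0, 0, 0, -100, -144, -400]
```

For orthogonal vectors the three cross terms vanish and each bi·b4 reduces to -|bi|^2, which is why orthorhombic (and cubic/tetragonal) lattices give the [0, 0, 0, -a^2, -b^2, -c^2] pattern used in the tests.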
Ok, that's perfect, thanks! I just wanted some more context on how the code was verified. No need to add additional tests for the monotonic case, since I imagine that might be slow, but perhaps add a code comment to the test file just to mention that that verification was performed.
Sounds good. Should I say that verification was performed in the sense that I referenced it against that paper?
Sure, or you can link the comment you posted above. I just want the additional context to be easy to find if someone else examines this code at a later date.
Alright, that makes sense. I'll add that in right now.
Merged, thanks @bwjustus!
Adds a property in pymatgen/core/lattice.py to calculate the array containing the six Selling scalars (returned as a 1-D array of shape (6,), per the review discussion above).