Add option to retain face data #74

ComFreek · 2018-08-03T07:19:32Z

(Follow-up from #73.)

Naming comments

I decided to choose consume_faces over collect_faces to remain consistent in the naming scheme already used for consume_vertices and other methods.

Tests fail!

Strangely enough, running the tests with nosetests will fail in test_parsers.py/class TestParsers. It seems that consume_faces doesn't get propagated to the Material objects.
However, running the class TestParsers in isolation with noosetests .\test\test_parser.py:TestParsers works.

I also tried cache=False, but that didn't change anything.

einarf · 2018-08-03T09:57:15Z

README.md

@@ -36,6 +36,7 @@ A more complex example
 * `strict` (Default: `False`) will raise an exception if unsupported features are found in the obj or mtl file
 * `encoding` (Default: `utf-8`) of the obj and mtl file(s)
 * `create_materials` (Default: `False`) will create materials if mtl file is missing or obj file references non-existing materials
+* `consume_faces` (Default: `False`) will consume a face list for every material as well. Beware that it will greatly increase memory usage for large models.


I'm not sure if "consume" is an understandable word to use here. If I didn't know the details of the parser I would be confused about this option.

Personally I liked "collect" much better. "consuming" is just a parser term of processing data and not necessarily related to what we do with the data itself.

Sure, renamed it back.

einarf · 2018-08-03T10:42:15Z

pywavefront/obj.py

@@ -336,6 +352,9 @@ def consume_faces(self):
            # Do we need to triangulate? Each line may contain a varying amount of elements
            triangulate = (len(self.values) - 1) > 3

+            if consumed_faces is not None:
+                consumed_faces.append([None] * (len(self.values) - 1))


Personally I would collect the face list as triangles only. Then we know each face is 3 elements in the face list. Right now for some models it would be impossible to understand the face list as some f statements can contain a variable amount of indices for each f statement (3 or more). Ensure you append inside the triangulate check as well.

I would also keep it simple and just append() to consumed_faces instead of pre-allocate space like this. Python lists over-allocate internally so memory allocations will be less and less frequent as the list increases in size. It makes more sense if you pre-allocate a lot of indices with list.extend.

I wish there was a way to use generators to collect this data (faster!), but since we also support python 2 we don't have yield from to do the magic.

Right now for some models it would be impossible to understand the face list as some f statements can contain a variable amount of indices for each f statement (3 or more).

Yes, this was actually intended. I am not sure how much preprocessing this library is supposed to do. Maybe someone (like me) just wants to access the raw face data from a Wavefront file. Especially, I could not find any other Python Wavefront parsing library (!).

In #73 it sounded a lot that this library is mainly targeted at "preparing interleaved vertex data". I described my use case in this comment at #73. I might be lucky that my Wavefront file only has triangle faces anyway, but if it had quad faces, I would have needed to roll my own library and thus duplicating a lot of raw Wavefront parsing. This seems unnecessary to me this project could distinguish between "raw parsing" and "preprocessing".

I would also keep it simple and just append() to consumed_faces instead of pre-allocate space like this.

Note that I don't preallocate the overall face list, but I preallocate the list of vertex indices of each and every face, i.e. I preallocate lists of 3 elements most of the time for my Wavefront files. So no amortization effects apply. But sure, one might call this "premature optimization". I'd argue in favor of keeping it because it is a very trivial line and also not decreasing readability/maintainbility.

einarf

If you need any help, don't be afraid to ask.

einarf · 2018-08-03T10:54:33Z

pywavefront/obj.py

@@ -336,6 +352,9 @@ def consume_faces(self):
            # Do we need to triangulate? Each line may contain a varying amount of elements
            triangulate = (len(self.values) - 1) > 3

+            if collected_faces is not None:
+                collected_faces.append([None] * (len(self.values) - 1))


Paste from outdated review:

Personally I would collect the face list as triangles only. Then we know each face is 3 elements in the face list. Right now for some models it would be impossible to understand the face list as some f statements can contain a variable amount of indices for each f statement (3 or more). Ensure you append inside the triangulate check as well.

I would also keep it simple and just append() to collected_faces instead of pre-allocate space like this. Python lists over-allocate internally so memory allocations will be less and less frequent as the list increases in size. It makes more sense if you pre-allocate a lot of indices w 8000 ith list.extend.

Since we also support python 2 here we don't have access to yield from magic.

1 2 3 # 1 triangle, 3 indices 3 4 5 6 # 2 triangles, 6 indices 7 8 9 10 11 # 3 triangles, 9 indices

Thanks for the example! Did you miss my previous comment by any chance (link)? What do you think?

Ah yes I missed it.

I think we should at least triangulate when collecting faces so new features can possibly be built on top of that.

einarf · 2018-08-03T11:15:41Z

test/test_parser.py

@@ -12,7 +12,7 @@ def prepend_dir(file):
 class TestParsers(unittest.TestCase):
    def setUp(self):
        # Append current path to locate files
-        meshes = pywavefront.Wavefront(prepend_dir('simple.obj'))
+        meshes = pywavefront.Wavefront(prepend_dir('simple.obj'), collect_faces=True, cache=False)


I would make a separate test class for collect_faces. This way we test the parser with and without the new feature.

class TestParserCollectFaces(TestParsers): # Override setup def setUp(self): meshes = pywavefront.Wavefront(prepend_dir('simple.obj'), collect_faces=True) self.mesh1 = meshes.mesh_list[0] self.mesh2 = meshes.mesh_list[1] # ... add your own test methods here ...

The reason you are getting errors is that other tests also extend TestParsers overriding the setUp methods, so they sabotage they tests you added in TestParsers. That is why you need to make your own class for testing the feature 😄

Alright! Will do.

ComFreek · 2018-08-07T17:40:36Z

@einarf

The CI tests are failing because I used yield from syntax only available in Python 3.3+. Is this OK?

Separated my tests. I deliberately avoided the inheritance from TestParsers because I added a new Wavefront file for testing.
Implemented triangulating. Is the order of the triangulated faces OK with you? I applied the same ordering as the vertex emitting, see the code in obj.py, line 417.

BTW, I removed the triangulate check, having the current index i is enough information and triangulate was redundant if I understood the previous code correctly.
Documentation for README and obj.py (i.e. consume_faces docstring) need to be updated. I wanted to wait for your opinion on 2 before I do that.

einarf · 2018-08-07T18:19:53Z

I personally prefer a project these days to be python3 only, but the project description states this is a python 2 and 3 library. This one is up to @greenmoss. I have avoided using yield from in the past to preserve this, but it's somewhat a torture for me to accept 😄
Faces should be emitted in the same order as vertices, so this is correct 👍

The rest looks reasonable.

greenmoss · 2018-08-08T01:32:31Z

Are there any metrics on python2 vs python3 support in other projects? What is the "best practice"?

Trivia: while investigating this question, I found https://pythonclock.org

LGTM as well. Ready to merge, pending either a passing test, or a compelling reason to drop Python 2 support.

I would suggest: Release a new major version: 1.0.0 that is python 3 only, update README and tests.
Maybe also create a 0.4.x branch here just in case the python 2 version need tweaks.

We can simplify a lot of things in the current code base with py3. Having yield from alone simplifies things so much. The code could need a quick cleanup round as well removing unnecessary stuff.

@greenmoss Are there any statistics on PyPI to see how many download of this package originated from a Python 2 installation?

Finished documenting the triangulation. (I noticed Basic Sphinx documentation #66 would indeed be nice to have.)
Added a preliminary "Drop Python 2" commit to see how CircleCI behaves - indeed the tests fail for Python 3.4 due to str/bytes.

I think dropping Python 2 support should rather be a separate PR, seems to "big" for one self-contained PR.

Is there any reason why the README states only 3.6+ support instead of, say, 3.5+? I picked the lowest 3.x version number, i.e. 3.4, for CircleCI now. (My actual use case for this library sadly still depends on 3.5.)

I guess the python version info in readme is a bit confusing. Feel free to tweak.

Dropping py2 in a separate PR make sense I guess.

@greenmoss Is there any consensus on how to continue from here?

@einarf I am not sure if rewriting this PR to Python 2 is worthwhile if the project will switch to Python 3 anyway.

This is up to @greenmoss really.

Option 1:

Make a 4.x branch here before merging this and release 4.2. Keep the 4.2 branch in case python 2 fixes are needed. (git checkout -b 0.4.x, git push origin 0.4.x, git checkout master)
Merge this PR and release a 1.0.0 release that is python 3 only (also updating readme)

Then we could move ahead and strip away the python 2 stuff. It doesn't create any issues as it is now. py2 is EOL soon and most people have moved to python 3 a long time ago.

Option 2:

Stay with py2/3 for now

Honestly when I think of it, with the activity level of this project and the fact that it's fairly small I would personally pick option 1. It's a driver to make more things happen in the project. Being stuck in the python 2/3 shared world is not fun. Doing the recent parser revamp was not a fun experience for me.

Just make sure to build releases with python3 for the 1.0.0 releases and don't make them universal. That means pip under python2 will pick 4.x releases.

Option 1 is fine for me. I'll see if I can get a release out this evening.

Release 0.4.2 is out, and branch python2 is created for legacy support. Feel free to tear out any Python 2-isms that are in your way, and we'll move to Python 3 on master.

@greenmoss There should be no py2 stuff standing in the way. The main thing is to update circleci to use py3 for tests and re-run tests for this PR + resolve the conflict in setup and merge.

@einarf interesting I see .circleci/config.yml on this branch updated to python:3.4, but it's still failing tests. Are there other updates needed as well? Or does it need to be something like python:3.7?

#76 should fix the python 3.4 issue. Rebase (or merge in a fresh master) this PR after merge and we're good for python 3.4 👍

Thanks @einarf ; merged #76

ComFreek mentioned this pull request

Aug 17, 2018

Switch to Python 3.4+ #77

Merged

ComFreek closed this

Aug 17, 2018

ComFreek reopened this

Aug 17, 2018

ComFreek closed this

Aug 17, 2018

ComFreek added 3 commits

August 17, 2018 15:08


          Use Python 3.4 on CI server

cd163c0


          Add option to retain face data

f99c25e


          Merge remote-tracking branch 'origin/switch-to-python3'

191dcda

ComFreek reopened this

Aug 17, 2018

Finally the tests are passing! @greenmoss This PR is ready to merge 😄

@ComFreek there are two un-checked check boxes in the comments above:

Documentation for README and obj.py (i.e. consume_faces docstring) need to be updated. I wanted to wait for your opinion on 2 before I do that.
Added a preliminary "Drop Python 2" commit to see how CircleCI behaves - indeed the tests fail for Python 3.4 due to str/bytes.

Are these still pending, or do you want me to merge as is?

greenmoss merged commit 70e601b into pywavefront:master

Aug 17, 2018

@greenmoss I ticked them both now. The first comment has been addressed in my latest commit in this PR. The second comment is obsolete now that @einarf fixed that issue in the other PR and we dropped Python 2 support in the master branch anyway.

Thanks for merging!

Yeah, saw that and merged. Thanks!

einarf reviewed Aug 3, 2018

View reviewed changes

Add option to retain face data #74

Add option to retain face data #74

Uh oh!

Conversation

Uh oh!

Naming comments

Tests fail!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding 8000 this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!