
conv func by myia #152

Open
dj163110 opened this issue Dec 21, 2018 · 3 comments

@dj163110

A very interesting project. I read the paper and tried to write some convolution functions with Myia, but they still don't work. Here is my code.

#myia/prim/py_implementations.py
@register(primops.flatten_op)
def flatten_op(bottom, height, width, kernel_size, stride):
    #height, width = bottom.shape
    h_size = (height - kernel_size) // stride + 1
    w_size = (width - kernel_size) // stride + 1
    top = np.zeros(shape=(h_size*w_size, kernel_size*kernel_size))
    for h in range(h_size):
        for w in range(w_size):
            t = bottom[h*stride:h*stride+kernel_size, w*stride:w*stride+kernel_size]
            top[h*w_size+w] = t.reshape(1, kernel_size*kernel_size)
    return top

#myia/prim/grad_implementations.py
@register_bprop(primops.conv_op)
def bprop_conv_op(x, height, width, y, z, out, dout):
    h_size = (height - y) // z + 1
    w_size = (width - y) // z + 1
    h = 0
    w = 0
    dx = zeros_like((height, width))
    while h < h_size:
        while w < w_size:
            db = reshape(dout[h*w_size+w], (y, y))
            dx[h:h+1, w:w+1] = dx[h:h+1, w:w+1] + db
            w = w + 1
        h = h + 1
    return (dx, 0, 0, 0, 0)

Then I got the error below:
File "/Users/kpy/src/opensrc/myia/myia/parser.py", line 420, in process_Subscript slice = self.process_node(block, node.slice) File "/Users/kpy/src/opensrc/myia/myia/parser.py", line 283, in process_node raise NotImplementedError(node) # pragma: no cover

I guessed that it might be related to the use of slicing, so I tried another method.

@register_bprop(primops.conv_op)
def bprop_conv_op(x, height, width, y, z, out, dout):
    h_size = (height - y) // z + 1
    w_size = (width - y) // z + 1
    h = 0
    w = 0
    dx = zeros_like((height, width))
    return (dx, 0, 0, 0, 0)

I got another error:

File "/Users/kpy/src/opensrc/myia/myia/pipeline/resources.py", line 423, in renormalize _, context = self.infer(graph, argspec, outspec, clear=True) File "/Users/kpy/src/opensrc/myia/myia/pipeline/resources.py", line 411, in infer tracks=self.required_tracks File "/Users/kpy/src/opensrc/myia/myia/infer/graph_infer.py", line 843, in run self.run_coroutine(_check()) File "/Users/kpy/src/opensrc/myia/myia/infer/graph_infer.py", line 907, in run_coroutine raise err myia.infer.core.MyiaTypeMismatchError: ('Tuple[Int[64], Int[64]] != Array[Float[64]]', [<myia.infer.graph_infer.Reference object at 0x11c0d7c88>])

Do you have any idea what is wrong with my code?

@breuleux
Member

Hi dj,

It looks like the indentation in your code blocks got messed up when you copy/pasted it.

The first error is due to the fact that the bprop function is not treated as a primitive but parsed as Myia code, and Myia does not support this kind of slicing at the moment (it will in the future), nor does it support setting slices (that feature will be added, but with a different syntax that lets us keep Myia purely functional). So it's basically a syntax error, and Myia doesn't report these cleanly at the moment.
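For intuition only (this is JAX's syntax, not Myia's planned one): purely functional array frameworks typically replace in-place slice assignment with an update operation that returns a new array. For example:

import jax.numpy as jnp

dx = jnp.zeros((4, 4))
db = jnp.ones((2, 2))
# In-place slice assignment mutates the array, which a purely
# functional IR cannot express:
#   dx[0:2, 0:2] = dx[0:2, 0:2] + db   # TypeError on immutable arrays
# The functional equivalent returns a fresh array instead:
dx = dx.at[0:2, 0:2].add(db)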

The second error is because zeros_like((height, width)) does not allocate a zero matrix with shape (height, width), which I assume is what you expected; it returns the tuple (0, 0). This has type Tuple[Int[64], Int[64]], but since you return it as the gradient wrt x, Myia expects something with the same type as x, i.e. Array[Float[64]], hence the error you see.

To clarify, zeros_like takes its argument, copies it, and sets every scalar to zero: give it a tuple and it'll return a tuple; give it an array and it will return a zeroed-out array of the same size; and so on. This is useful for the AD transform. What you want would be a different op, e.g. a zeros op as there is in numpy... although, unfortunately, there is no such op in Myia at the moment. We haven't added any array allocation ops yet. So you might want to start with that, and if you get it to work, do a PR; it'd be appreciated!
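A minimal sketch of that structure-preserving behaviour (illustrative only, not Myia's actual implementation):

import numpy as np

def zeros_like(value):
    # Copy the structure of `value`, replacing every scalar with zero.
    if isinstance(value, tuple):
        return tuple(zeros_like(v) for v in value)
    if isinstance(value, np.ndarray):
        return np.zeros_like(value)
    return type(value)(0)  # e.g. int -> 0, float -> 0.0

zeros_like((3, 4))           # -> (0, 0): a tuple, not a 3x4 array
zeros_like(np.ones((3, 4)))  # -> a 3x4 array of zeros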

Keep in mind that this is still pre-alpha software, so some features that would be essential for a release are still missing, the error reporting is patchy, and the process to add ops is not well documented (mainly because it's probably going to change). We do plan to add a conv op, but we have yet to determine what would be the best interface for it -- you're welcome to discuss it with us and to try to come up with a prototype. We probably won't be very responsive in the next two weeks, though, because we'll all be on holiday.

@dj163110
Author

Thanks for the reply. I've just begun adding some array allocation ops and reading more of the code, but some modules confuse me.

1. I'm not sure what 'specialize' modifies after type inference.
2. Why do you transform the class type into a tuple in the 'erase class' step? Is it just to get at the list of functions in the class, or is there another purpose?

Hmm, I think I'm still not very familiar with Myia.

@breuleux
Member

breuleux commented Jan 8, 2019

Sorry for the late reply, I was on holiday.

  1. Specialize will create a distinct copy of every graph for every type signature it can have, so that every node of every graph has a unique concrete type. Essentially, if you have two calls f(1) and f(1.0), prior to specialize you have one graph for f. After specialize, you have two, one for an integer input and the other for a float input, with all call sites adjusted to call the proper version. This representation is easier to deal with than a duck-typed one where nodes may have a variety of types depending on the caller. (See the first sketch after this list.)

  2. It's a simplification of the representation. A point with fields x and y is just going to be represented as a tuple of two elements. After we infer all types statically, we know that we can replace every instantiation of a point by an instantiation of a tuple, and all occurrences of point.x by point[0], and so on, and it will make no difference to the computation. It makes optimization simpler, because we only have to deal with tuples and not all sorts of classes. (See the second sketch after this list.)
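To illustrate point 1, a hypothetical before/after sketch in plain Python (the names f_int and f_float are made up; this is pseudocode for the idea, not Myia's internal graph representation):

# Before specialization: a single graph for f, whose nodes could be
# Int[64] or Float[64] depending on the caller.
def f(x):
    return x + x

f(1)    # called with Int[64]
f(1.0)  # called with Float[64]

# After specialization: one copy of the graph per type signature,
# with each call site rewritten to target the matching copy.
def f_int(x):    # every node in this graph has type Int[64]
    return x + x

def f_float(x):  # every node in this graph has type Float[64]
    return x + x

f_int(1)
f_float(1.0)

And to illustrate point 2, an equally hypothetical sketch of class erasure (the Point class and norm2 function are invented for the example):

# Before erasure: a class with fields x and y.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

def norm2(p):
    return p.x * p.x + p.y * p.y

norm2(Point(3.0, 4.0))

# After erasure: the instantiation becomes a tuple and field access
# becomes indexing; the computation is unchanged.
def norm2_erased(p):
    return p[0] * p[0] + p[1] * p[1]

norm2_erased((3.0, 4.0))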
