The GhPy component is known to be a bit slow, reportedly because of the JIT compiler it relies on. Still, GhPy is a powerful tool for testing scripting logic without the baggage of an IDE. More importantly, only seasoned programmers are familiar with setting up IDEs. The typical architectural professional simply does not have the resources to customize a Python IDE. I myself have no experience configuring my own IDE. The best I've done is use the .NET template provided by McNeel in Visual Studio, and that is in C#.
GhPy is here to stay, and we do have tricks to make it go faster. Aside from writing better routines and more efficient code, the most straightforward way to cut run time is multi-threading. To my knowledge, hardware is quite far ahead of software when it comes to multi-threading. Before McNeel used the GPU for their viewports, all the graphics were handled by a single CPU core. Rewriting code to take advantage of multiple processors takes time and has its limits. It isn't as simple as throwing a chunk of program at a more robust chip and having it suddenly multi-task for you.
I don't pretend to be an expert, and frankly there is probably only one thing I know. But that one thing is enough to be a roadblock to multi-threading some routines. If more than one thread needs to access a single entity at the same time, the computer freaks out. There is no inherent hierarchy or sequencing to this simultaneous traffic, so the entire program becomes very prone to corruption (a race condition). BAD DATA!
This is actually a big issue because, in Python, a lot of tasks that can be divided across threads will eventually need to add some sort of return value to a collection, be that a list or a dictionary. The same holds true for other programming languages, because the data types across languages are pretty similar.
To truly understand multi-threading, I think someone with knowledge of lower-level languages would ace the task. With high-level languages, managing memory locations and such just sounds like a nightmare to me. But there are mature libraries that can help. For Python, let us introduce the threading module. Why this module? Because it's the only one that comes with IronPython 2.7 on your Grasshopper. No, multiprocessing is not a module you can call from GhPy.
The way GhPy is embedded in Grasshopper, which in turn sits inside Rhino, may prevent certain coding patterns from running properly, but there are some super simple tasks involving a large number of repetitive computations that we can multi-thread. Here is a simple snippet.
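To make the shared-collection problem concrete, here is a minimal sketch of several threads appending to one list, with a lock guarding the write. The `worker` function and the squaring step are hypothetical stand-ins for real work; only the `threading` module itself is assumed.

```python
import threading

results = []                 # shared collection every thread appends to
lock = threading.Lock()      # guards access to `results`

def worker(n):
    value = n * n            # hypothetical stand-in for real work
    with lock:               # only one thread at a time may run this block
        results.append(value)

threads = [threading.Thread(target=worker, args=(n,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()                 # wait for every thread before reading results

print(sorted(results))
```

The order in which threads append is not guaranteed, which is why the output is sorted before printing.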
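import threading, math

def Proc():
    # Busy loop: count up to 2**20
    i = 0
    ct = math.pow(2, 20)
    for n in range(int(ct)):
        i += 1

thrs = []
for i in range(20):
    thrs.append(threading.Thread(target=Proc))
for thr in thrs: thr.start()
# Wait for all threads so the profiler measures the actual work
for thr in thrs: thr.join()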
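If you would rather time the two variants in code than rely on the profiler, a rough harness along these lines could be used. It is only a sketch: `run_sequential` and `run_threaded` are names made up for illustration, and the thread count of 4 is arbitrary. Keep in mind that IronPython has no global interpreter lock, while regular CPython does, so the threaded run should not be expected to win outside of IronPython.

```python
import threading
import time

def proc():
    # Same kind of busy loop as in the snippet above: count to 2**20.
    i = 0
    for _ in range(2 ** 20):
        i += 1

def run_sequential(count):
    start = time.time()
    for _ in range(count):
        proc()
    return time.time() - start

def run_threaded(count):
    start = time.time()
    threads = [threading.Thread(target=proc) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()    # without this, the timer stops before the work is done
    return time.time() - start

seq = run_sequential(4)
thr = run_threaded(4)
print("sequential: %.3f s, threaded: %.3f s" % (seq, thr))
```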
For comparison, you can get rid of the threading part and simply call Proc() in the first for-loop. Run the two in a GhPy component and turn on Grasshopper's profiler so you can see the time difference. Maybe at this point you are happy to have learned how to multi-thread a custom Grasshopper component. It seems our work is done. But here comes the twist. When I try it on a real API function call, the time savings are mediocre at best. I have a code snippet here.
from ghpythonlib.componentbase import executingcomponent as component
import ghpythonlib.treehelpers as tree
import Grasshopper as gh, GhPython as ghpy, System, Rhino as rh
import threading

class PllDiv(component):
    def __init__(self):
        super(PllDiv, self).__init__()
        self.result = []

    def DivCrv(self, c, i):
        # Divide curve c every 5 units and store its points in slot i
        params = c.DivideByLength(5, True)
        pts = [c.PointAt(t) for t in params]
        self.result[i] = pts

    def RunScript(self, C):
        # One preallocated slot per curve, so each thread writes to its own index
        self.result = [None] * len(C)
        self.Message = "threaded"
        thrs = [threading.Thread(target=self.DivCrv, args=(C[i], i)) for i in range(len(C))]
        for thr in thrs: thr.start()
        # Wait for every thread to finish before reading the results
        for thr in thrs: thr.join()
        return tree.list_to_tree(self.result)
Of course I've switched GhPy to developer mode so I can stay organized. The component takes a few curves and divides them by distance along the curve. To compare it with a single thread, all that needs to be done is to call self.DivCrv() directly in a loop over the curves instead of creating and starting threads. With 300 curves, each yielding about 130 divisions, my threaded component is only about 1/6 faster than the non-threaded one.
Technically this code should be thread-unsafe. Notice that locks are nowhere to be found. Whenever the function writes to the list and replaces the value at an index, it's a vulnerable moment; at least that's how it works in C# code. When I do add a lock object, which I might be doing incorrectly, the threaded component actually computes slower. My understanding is that acquiring and releasing locks adds too much overhead.
So multi-threading in GhPy is very peculiar. I know that ghpythonlib offers some parallel methods as well. Perhaps that's where I should experiment next. However, I set out to find the easy fix for GhPy because of its "on-the-spot" nature. The threading module hasn't given me what I was after. It may come in handy when the script is mainly agent simulation or any intense OOP. At the moment, the best tools for multi-threading come from compiling a full Grasshopper component in an IDE.
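One possible reason for the modest gain, and this is a guess rather than something I measured, is that one thread per curve means 300 threads each doing very little work. A variation worth trying is to split the inputs into a handful of chunks and give each chunk to one thread. The sketch below shows the idea outside of Rhino; `divide` is a hypothetical stand-in for the per-curve work, and `run_chunked` and its `thread_count` parameter are names I made up for illustration.

```python
import threading

def divide(item):
    # Hypothetical stand-in for the per-curve work (e.g. dividing a curve).
    return [item * k for k in range(3)]

def process_chunk(items, results, start, stop):
    # Each thread fills its own range of slots; no two threads share an index.
    for i in range(start, stop):
        results[i] = divide(items[i])

def run_chunked(items, thread_count=4):
    results = [None] * len(items)
    # Ceiling division, with a floor of 1 so an empty input does not break range()
    size = max(1, (len(items) + thread_count - 1) // thread_count)
    threads = []
    for start in range(0, len(items), size):
        stop = min(start + size, len(items))
        t = threading.Thread(target=process_chunk,
                             args=(items, results, start, stop))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()    # wait for all chunks before returning
    return results

out = run_chunked(list(range(10)), thread_count=4)
print(out[:2])
```

Inside a GhPy component, `divide` would be replaced by the body of DivCrv, and `thread_count` could be set to the number of cores on the machine.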
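For anyone who wants to try the ghpythonlib route, a first experiment might look like the sketch below. It assumes `ghpythonlib.parallel.run(function, data_list, flatten_results)` as it appears in McNeel's samples, and falls back to a plain loop when run outside Rhino. The `work` function is a hypothetical stand-in for a per-item API call, and I have not benchmarked this.

```python
try:
    # Available only inside Rhino/Grasshopper.
    import ghpythonlib.parallel as parallel
except ImportError:
    parallel = None

def work(x):
    # Hypothetical stand-in for a per-item RhinoCommon call.
    return x * 2

def run_all(items):
    if parallel is not None:
        # Assumed signature: run(function, data_list, flatten_results)
        return list(parallel.run(work, items, False))
    # Fallback outside Rhino: plain sequential loop.
    return [work(x) for x in items]

print(run_all([1, 2, 3]))
```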