Still working on finishing the track on #DataCamp. But I wanted to add a little more to this.
It took me most of a year to discover this, but I struggled mightily with data analysis functions in #Python + #Numpy + #Pandas, in #R-lang, and in #Julia-lang. #SQL was much easier to comprehend. But I've recently had a few courses where they were covering pure Python, without the data analysis packages, and that is totally different.
Even though I've barely touched Python in the past 20 years or so, it feels familiar and almost everything we do feels "natural". With the data analysis / data science content, it feels like there are dozens of nearly identically named functions and methods, each with its own special syntax and list of arguments to pass to it. They are easily mixed up, and I always (no, seriously, always) pick the wrong one first.
I guess that's not a DataCamp issue, but more of a problem with the tools being covered.
But DataCamp's method doesn't help much with this. Each one-hour chapter of each four-hour course is supposed to be a sequence of bite-sized tools that one learns to use and then recalls when they come up again later. Unfortunately, it quickly turns into a big ball of mud.
Good news: #NumPy 1.23.2 through 1.24.4 builds with the latest #pytest and #MyPy; other versions up to 1.26.4 could not be built without a higher level of wizardry.
What if we had something similar to the #numpy API on the #web and in #node etc.? V8 is faster than CPython, sure, but you still can't get close to the performance of a hand-crafted SIMD number-crunching loop written in C when writing #JavaScript (or using #WebAssembly, for that matter). You kinda can in #Python, thanks to numpy.
You may argue that most web/JS things don't need that level of performance... but for sites that want to do e.g. some pretty complex image processing, SIMD would be great.
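To make the image-processing point concrete, here's a minimal sketch in Python (since that's where the numpy API lives) of the kind of kernel I mean: the naive version pays interpreter overhead per pixel, while the numpy version runs as one vectorized C loop that the backend can compile to SIMD. The function names and toy data here are my own illustration, not from any particular library.

```python
import numpy as np

def brighten_naive(pixels, amount):
    # Pure-Python per-pixel loop: one interpreter round-trip per element.
    return [min(p + amount, 255) for p in pixels]

def brighten_numpy(pixels, amount):
    # One vectorized expression over the whole array; the addition and
    # clamp run inside numpy's compiled (often SIMD) inner loops.
    return np.clip(pixels + amount, 0, 255)

img = np.arange(256, dtype=np.int64)  # toy 8-bit "image" as a flat array
assert brighten_naive(list(img), 40) == list(brighten_numpy(img, 40))
```

The two produce identical results; the vectorized one is what a numpy-like API would let browsers do without dropping into hand-written WebAssembly.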
Learning about #numpy comics at the documentation summit. How can we make #opensource documentation more fun and user-friendly?! Do you have ideas? #pyconus @pyOpenSci
With SQL, at least, it seems to be an artifact of the way their hands-on code runner works (it displays a short `head` of the relevant tables), so when you're working on queries, you may not have a direct way to see whether your query does specifically what you expected and intended.
With R-Lang, it is just that it isn't always apparent what the language will do. Some things are inexplicably backwards compared to most other languages I've seen, so mentally I tend to go with the wrong choice. Also, the practice question set is too small. I've reached the point where some of the practice exercises are familiar enough that I know which answer to choose immediately without having any understanding of why that is the correct choice.
@sirber @amin scratch the mighty #python and invariably you find some C/C++ lib that is as high-performance as it gets. Case in point: the #lxml lib for parsing whatever you fetch from the web... https://pypi.org/project/lxml/
Where Python will let you down performance-wise is if you write "naive" code for a computationally intense task and don't use something like #numpy, #cython, etc.
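A quick sketch of what "naive" means here, assuming a standard numpy install (the helper name and sizes are just for illustration): the hand-rolled loop below and `np.dot` compute the same thing, but the latter dispatches to a compiled BLAS kernel instead of looping in the interpreter.

```python
import numpy as np

def dot_naive(a, b):
    # "Naive" Python: interpreter overhead on every single iteration.
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

a = np.random.rand(1000)
b = np.random.rand(1000)

fast = float(np.dot(a, b))  # dispatched to a compiled BLAS routine
slow = dot_naive(a, b)      # same math, thousands of interpreter steps
assert abs(fast - slow) < 1e-9
```

On larger arrays the gap between the two routinely reaches two orders of magnitude, which is the whole argument for reaching for #numpy or #cython before blaming the language.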
@_wurli if people optimized #numpy to automatically take maximum advantage of all available hardware (multi-core, GPU, etc.), it would make moot most performance concerns around #python, for a wide range of applications.
The complexity of working with #cuda, C++ bindings, or inventing new languages like #mojo would be largely mitigated on an 80/20 principle.
I don't know if there is an intrinsic obstacle to this happening, or if there is some other reason...
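For the multi-core part, at least, some of this already exists: numpy delegates dense linear algebra to whatever BLAS it was built against (OpenBLAS, MKL, ...), and those libraries are typically multi-threaded out of the box. A quick way to check your own install (no hypothetical APIs here, just standard numpy calls):

```python
import numpy as np

# Report which BLAS/LAPACK backend this numpy build is linked against.
np.show_config()

# A large matmul like this is handed to the BLAS gemm kernel, which will
# usually spread across all cores without any extra code from us.
a = np.random.rand(500, 500)
c = a @ a
print(c.shape)  # (500, 500)
```

The GPU side is the part that still needs a separate stack, which is presumably where the #cuda / #mojo complexity creeps back in.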