Wednesday, December 24, 2008
Merry Christmas and Happy Holidays!
Happy Holidays!
Monday, December 22, 2008
Python 3.0 and print function
There has been a lot of discussion and argument about a change in Python 3.0 in which print is now a built-in function instead of a statement. The arguments started flowing into the mailing list just minutes after Python 3.0 final was officially released.
Most complaints seem to revolve around the fact that *now* we have to write two extra parentheses, i.e. print("something") instead of print "something". Let us ignore that complaint for now, as it is a minor (not to mention pointless) issue, something to get used to and easily solved with IDEs. Besides, we all *know* how to use functions, right? This is a non-issue.
Another (much larger) complaint is about string formatting (PEP 3101 - Advanced String Formatting), which is a very big improvement over formatting with "%". A new built-in function, format(), has been introduced along with the (so-called) "Format Specification Mini-Language" to specify and control formatting. Quite powerful, if you ask me.
A particularly interesting formatting option is the formatting of positive and negative numbers (the sign option). Consider implementing this in the old syntax:
print(format(-123456, '#> 20'))  # #############-123456
print(format(123456, '#> 20'))   # ############# 123456
print(format(123456, '#>+20'))   # #############+123456
Furthermore, you can implement the __format__() method in your classes; it will be called by the format() built-in function to produce a formatted version of the object. However, interpreting the format specification is up to you - you can support the standard formatting syntax or invent your own.
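As a small sketch (the Temperature class is my own illustration, not anything from the standard library), a class can delegate to the built-in formatting machinery and then decorate the result:

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    def __format__(self, spec):
        # Reuse the standard float mini-language for the numeric part,
        # falling back to one decimal place when no spec is given.
        if not spec:
            spec = '.1f'
        return format(self.celsius, spec) + '°C'

t = Temperature(21.456)
print(format(t, '.2f'))       # 21.46°C
print('{0:+.1f}'.format(t))   # +21.5°C
```

Note that the same __format__() hook is used by both the format() built-in and str.format(), so one implementation covers both.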
Bottom line is, Python 3.0 brought a lot of changes and improvements; those changes were discussed and polished over a period of several years before the final version was released. What I don't understand are the people who are complaining about it now, instead of raising their concerns earlier and contributing to those discussions. It will take some time to get used to the new changes, sure; but don't dismiss them straight away - give them a chance and try them out. You will soon realize that those changes are in fact good ones.
Friday, December 5, 2008
Python 3.0 released
Sadly, there is no official package for OS X as of yet, but I'm sure it will arrive after the weekend. I'm itching to take it for a spin on my code. There will be some incompatibility issues, but the 2.6 release helps a lot in sorting those out.
Have a nice weekend, folks!
Thursday, October 2, 2008
Python 2.6 final
I'm fetching the installation now, eager to see if the installation will work with PyObjC on my Mac. Next thing on the list is to install Stackless.
Wednesday, October 1, 2008
Apple dropping NDA for iPhone Developers
In a released message, Apple stated that they introduced the NDA to protect the many inventions and innovations that went into the iPhone OS. They (finally) recognize that this has put a huge burden (and lots of problems) on developers, so they have decided to drop the whole thing.
However, the agreement is dropped only for developers who have released their software on the App Store; the NDA still covers unreleased software until its release.
This will make a lot of developers much happier. Very good move by Apple!
Wednesday, September 17, 2008
World is concurrent
Stackless Python introduces the concept of microthreads: tasklets wrap functions, allowing them to be launched as microthreads. Scheduling is built in and can be either cooperative or preemptive. Finally, there are channels, which are used for communication between tasklets. A channel blocks a sending tasklet until a receiver is waiting, and blocks a receiving tasklet until a sender is waiting. Another interesting feature is that tasklets can be serialized (pickled) to disk (or any other storage medium) and deserialized later to be resumed.
Using the functionality provided by Stackless (through a single module) is very easy and intuitive. It keeps Python code readable and understandable, and it even improves the structure of the program. Common usage patterns are available from the authors of Stackless to help people new to concurrent programming understand how Stackless is used. It allows creating custom tasklet functionality (e.g. named tasklets) as well as custom channel functionality (e.g. broadcast channels, sending messages with a timeout, etc.).
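The basics are compact enough to show in a few lines. Here is a minimal sketch (it assumes a Stackless Python interpreter; the stackless module is not part of plain CPython) of two tasklets communicating over a channel, with the blocking send/receive behaviour driving the scheduling:

```python
import stackless

def producer(channel):
    for i in range(3):
        channel.send(i)      # blocks until a receiver is waiting

def consumer(channel):
    for _ in range(3):
        # blocks until a sender is waiting
        print('received', channel.receive())

ch = stackless.channel()
stackless.tasklet(producer)(ch)
stackless.tasklet(consumer)(ch)
stackless.run()   # run the cooperative scheduler until all tasklets finish
```

Note how there is no explicit locking anywhere; the channel is both the synchronization point and the data path.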
As I mentioned, scheduling is built in and it is up to the programmer to choose the type: preemptive or cooperative. With cooperative scheduling, one has to write code so that tasklets actually cooperate: during design and implementation, the programmer should invoke the scheduler manually whenever an operation within a tasklet might starve other tasklets of their chance to run. With preemptive scheduling, on the other hand, the scheduler itself is configured to interrupt and reschedule running tasklets. I found cooperative scheduling more useful in my implementations, since it gives more control. Under preemptive scheduling you can, however, kindly ask the scheduler not to put your running code back into the scheduling queue. One useful idiom is the sleeping tasklet, which blocks a tasklet for a certain period of time; interestingly, that idiom uses channels to accomplish this.
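The sleeping-tasklet idiom can be sketched roughly like this (again assuming a Stackless interpreter; this is a simplified, busy-polling version of the published idiom, and names like manager and worker are my own):

```python
import time
import stackless

sleepers = []   # (wake_time, channel) pairs

def sleep(seconds):
    # Block the calling tasklet on a private channel; the manager
    # tasklet sends on it once the wake-up time has passed.
    ch = stackless.channel()
    sleepers.append((time.time() + seconds, ch))
    ch.receive()

def manager():
    # Wake due sleepers, then yield so other tasklets can run.
    while sleepers or stackless.getruncount() > 1:
        now = time.time()
        for entry in [s for s in sleepers if s[0] <= now]:
            sleepers.remove(entry)
            entry[1].send(None)
        stackless.schedule()

def worker():
    print('going to sleep')
    sleep(1.0)
    print('woke up')

stackless.tasklet(manager)()
stackless.tasklet(worker)()
stackless.run()
```

The published idiom avoids busy polling, but the core trick is the same: a channel per sleeper, with receive() doing the blocking.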
Documentation for Stackless is available on the website and covers the basic functionality; it also offers examples and common patterns (idioms). The module itself is very well documented and available in the Python interactive shell by typing help(stackless). The community is active on the mailing list, where help is always available. Every now and then there's a good discussion about advanced usages of Stackless, and I highly recommend subscribing to the list.
Issues exist, however. The current CPython implementation suffers from the infamous GIL (Global Interpreter Lock), which makes it difficult for Python to fully utilize multicore systems (almost every recent computer). For those who don't know its effects, the GIL is a mechanism that keeps multiple threads from modifying the same objects at the same time: only the thread holding the interpreter lock may execute Python bytecode, and the interpreter controls acquiring and releasing the lock, releasing it around potentially slow operations (such as I/O).
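A quick way to observe this yourself (an illustrative sketch, not a proper benchmark): run a pure-Python, CPU-bound loop once sequentially and once split across two threads. On standard CPython the threaded version is typically no faster, because the GIL lets only one thread execute bytecode at a time.

```python
import threading
import time

def countdown(n):
    # Pure-Python, CPU-bound work: never releases the GIL voluntarily.
    while n:
        n -= 1

N = 5000000

start = time.time()
countdown(N * 2)
sequential = time.time() - start

start = time.time()
t1 = threading.Thread(target=countdown, args=(N,))
t2 = threading.Thread(target=countdown, args=(N,))
t1.start(); t2.start()
t1.join(); t2.join()
threaded = time.time() - start

print('sequential: %.2fs, two threads: %.2fs' % (sequential, threaded))
```

The exact numbers depend on the machine, but the point is that the two threads do not divide the work across cores.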
There has been quite a lot of talk about the GIL and whether or not it should be removed from Python. Back in '99, a few brave souls (Greg Stein and Mark Hammond) tried removing the GIL from the Python code base, using locks on all mutable structures. According to benchmarks, it actually slowed execution: single-threaded code ran roughly twice as slow as with the GIL. And on multi-CPU (multi-core) systems, there were no actual performance gains from removing the GIL.
To truly distribute the code between CPUs, the solution is to run several Python processes and pass tasklets between them; however, that can get (very) complicated, to say the least. Luckily, Python 2.6 (the next major release of Python, with the final version just around the corner) comes with the multiprocessing module. This module supports spawning processes in a fashion similar to the threading module. Better yet, since it follows the same API as threading, it makes refactoring projects that use threading a breeze. Process objects can communicate with each other using Queues or Pipes, the latter being bidirectional (each party holds one end of the pipe, with send() and recv() methods available for sending and receiving).
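A small sketch of the Pipe-based messaging (the worker function and the messages are my own illustration, but Process, Pipe, send() and recv() are the module's actual API):

```python
from multiprocessing import Process, Pipe

def worker(conn):
    # Receive a task over the pipe, do the work, send the result back.
    task = conn.recv()
    conn.send(task.upper())
    conn.close()

if __name__ == '__main__':
    parent_end, child_end = Pipe()   # a Pipe is bidirectional
    p = Process(target=worker, args=(child_end,))
    p.start()
    parent_end.send('hello from the parent')
    print(parent_end.recv())   # HELLO FROM THE PARENT
    p.join()
```

The __main__ guard matters here: on platforms where child processes are spawned rather than forked, the module gets re-imported in the child.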
Using the multiprocessing module, you can even distribute your processes remotely, and on top of that it is possible to create a pool of workers for running tasks. Finally, since the module uses subprocesses instead of threads, the effects of the aforementioned GIL can be circumvented.
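The worker pool is a one-liner to set up; here is a sketch (the square function is just an illustration) that fans a computation out across four processes:

```python
from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    pool = Pool(processes=4)   # a pool of 4 worker processes
    # map() splits the iterable across the workers and
    # gathers the results back in order.
    results = pool.map(square, range(10))
    print(results)   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
    pool.close()
    pool.join()
```

Because each worker is a separate process with its own interpreter, the squares really are computed in parallel on a multi-core machine.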
Stackless, by design (because of the scheduler), does not allow tasklets to access (and modify) data structures at the same time. Moreover, in true concurrency fashion, it uses channels for passing data between tasklets. It still has a notion of shared state between tasklets, but of course without the usual dangers. With multiprocessing on Python 2.6, Stackless Python programs should become much more scalable than they are currently, utilizing multi-CPU and multi-core environments more efficiently. That, however, remains to be seen, as the port of Stackless to Python 2.6 is a work in progress.
Enter Erlang! For those of you who don't know what it is, Erlang is a functional programming language (an entirely different approach to programming compared to, e.g., OOP) designed at Ericsson for the purpose of developing concurrent, distributed, fault-tolerant and (soft) real-time applications.
Being functional, it has no concept of mutable state: variables are single-assignment (just as they taught you in math class), and it offers dynamic typing, pattern matching of functions (with guards), and so on. On top of that, it has extremely lightweight processes with no shared memory; processes communicate by passing messages between themselves. The Erlang runtime supports a very high number of concurrent processes, which are abstracted from the operating system. It also supports dynamic distribution of processes, both locally (over multiple CPUs or cores) and remotely (over the network to another Erlang runtime node). Thus, you can build large-scale distributed applications that run on machines with many CPUs and on many machines in a network. Since there is no shared state and there are no shared variables, all the traditional problems related to concurrency simply disappear, along with the need for locks.
Erlang does give a lot of people headaches because of its syntax. Comparing it with Python's, I can say (being biased and all) that I definitely prefer Python's. On the other hand, I quite like Erlang's syntax, too. It looked quite bizarre at first, but going through the documentation and examples helped me understand the basics. The Erlang book I purchased has just been delivered, and I am very excited to learn more about Erlang. As I have always been more interested in building backend applications, Erlang seems like a good choice worth learning more about.
As a follow-up, the next article on this topic will be sprinkled with code examples, both in Stackless Python and in Erlang. I really like learning new programming languages and frameworks by implementing something useful with them, so I will try to do the same with Erlang. Some ideas are already popping into mind, and I might elaborate on them in future articles, too.
Wednesday, August 27, 2008
Django on Jython
Although not my first framework of choice for developing web applications in Python, Django has been adopted by many developers out there, even a few defecting from Ruby on Rails. Django follows the MVC design pattern, it has URL patterns similar to Routes in RoR, etc. One slight problem for developers might be the fact that Django has its own database API - although I believe another ORM, such as SQLAlchemy, can be used instead.
I have been following the development of Jython very closely, mainly because I'm planning to introduce it in my company. Our software policy states that the main development language is Java, and I'm trying to push an initiative that would let us use any language as long as it runs on the JVM. Nowadays there are several very good (and strong) candidates to replace Java, and Jython stands, IMHO, as one of the strongest. It takes the strengths of the Python language and brings them to the JVM.
We adopted the Spring Framework in our development to help us minimize time to deliver and maximize ROI. For web applications we use Spring MVC in both portlet and servlet environments; it has improved the quality of our deliveries and shortened delivery times. However, there is still the huge overhead of code compilation in our day-to-day development. Something like Jython (possibly with Django) could eliminate that problem partly or entirely.
The Django developers are going for a 1.0 release early next week. With that in mind, the excellent Django Book available online and the fact that Django runs on Jython are more than enough reasons to get to know it better and (even) start using it in an enterprise environment. Why not?
Thursday, July 31, 2008
Python, the evolution
The project's source code is available here, free of charge, of course. It was created by Michael Ogawa using the Processing environment. Great stuff!
Friday, July 11, 2008
Protocol Buffers
On the surface, it reminds me a lot of CORBA, especially the way you define message structures, but it differs greatly in terms of message exchange and serialization - you can store and/or communicate your data structures across the network by any means, unlike CORBA, where you are forced to use CORBA message brokers. In my opinion, that's the main reason why CORBA was never widely accepted.
How does it all work? First of all, you define your structure using a DSL and store it in a .proto file. You then compile that file using a tool, which generates data access classes for your language of choice. Classes can be generated for C++, Java and Python - my choice would be, as always, Python. The generated classes are then used to create, populate, serialize and parse your protocol buffer messages.
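To give a feel for the DSL, here is a small sketch of a message definition (the Person message and its fields are my own hypothetical example, written in the proto2 syntax that was current at the time):

```proto
// person.proto -- a hypothetical message definition
message Person {
  required string name  = 1;   // field numbers identify fields on the wire
  required int32  id    = 2;
  optional string email = 3;
}
```

Running `protoc --python_out=. person.proto` generates a `person_pb2.py` module; you then instantiate `person_pb2.Person()`, set its fields, and call `SerializeToString()` / `ParseFromString()` to move the message across the wire or into storage.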
The messages are very flexible - you can add new fields to your messages without breaking old code that uses them; old code will simply ignore the new fields. That functionality comes in very handy, especially in larger systems (think versioning and deployment).
Yes, but what about XML, you might ask? According to Google, PBs have many advantages over XML; they are:
- simpler
- 3 to 10 times smaller
- 20 to 100 times faster
- less ambiguous
- generate data access classes that are easier to use programmatically
I might disagree with the last one (think of JAXB in the Java world), but I completely agree with the others - it's quite true that XML tends to be cumbersome and a big overkill, especially in environments where size and speed do matter.
For the end, I saved a very interesting quote from Google's PB pages:
Protocol buffers are now Google's lingua franca for data - at time of writing, there are 48,162 different message types defined in the Google code tree across 12,183 .proto files. They're used both in RPC systems and for persistent storage of data in a variety of storage systems.
Looks very interesting and very promising, considering Google is behind it. I'll follow up this article with some neat examples.
Tuesday, July 1, 2008
Back at work after a long vacation
Now, the idea - wouldn't it be great to have a feature which would, upon opening your email client after a long vacation, populate your inbox little by little? Let's say 120 messages were sent to you while you were on vacation. The client would deliver a dozen messages every 10 minutes, so you could slowly read them and catch up, preferably over a good cup of coffee. No stress, no worries; you read your mail at the same rate as you would have had you not gone on vacation. I'm sure it would take some people a few days to catch up, but it sure beats reading 500+ emails in one hour.
Back to reading emails.
Tuesday, April 22, 2008
Core Spring Course
I'm attending a Core Spring Training Course this week, held and organized here in Iceland by my company. I have finally managed to persuade my superiors to invest into Spring training for all the employees of our department. The course is held by Arjen Poutsma, the developer behind Spring Web Services project.
While I have been developing applications with the Spring Framework for the past few years, I am finding the Core Spring training course very informative and helpful. Most of the material covered so far is familiar and well known to me, apart from the more advanced usage of Spring AOP, which I haven't used much so far. That is definitely going to change, I am sure. Every topic is followed by a very well designed lab where the newly learned material can be exercised.
Things to cover in the next two days include Spring MVC and Web Flow, configuring Spring Security, remoting with Spring Web Services, etc. I sure hope we'll be covering the (soon to be released) Web Flow 2.0 - I promised to write more about it when we start adopting it in projects, but we have not reached that point yet (one of the reasons being that Web Flow hasn't reached its final 2.0 version).
While I prefer figuring out how things work by experimenting on my own, it will be great to get information about Web Flow first-hand through the presentations and lab work on the course. Which brings me to the point: the real advantage and real value of this course is the chance to talk in person to the people behind the Spring Framework, hear their opinions, and get some good tips and advice (especially advice on AOP advices). The labs are in fact designed with that in mind.
I will write more about Spring AOP and Web Flow in the following days, so stick around.
Friday, February 22, 2008
Spring MVC easy way
Wednesday, February 20, 2008
Jython development gaining momentum
Monday, February 4, 2008
Technical evangelist?!
- Evangelism is the verbal proclaiming of the Christian Gospel or, by extension, any other form of preaching or proselytizing.
- "To announce the good news", one who preaches the facts of the Gospel in order to win converts.
- The traditional view of the evangelist is a bearer of the "Good News", proclaiming the gospel to the unbelieving world.