What is exactly the function of Python’s Global Interpreter Lock?
Do other languages that are compiled to bytecode employ a similar mechanism?

In general, for any thread safety problem you will need to protect your internal data structures with locks.
This can be done with various levels of granularity.

  • You can use fine-grained locking, where every separate structure has its own lock.

  • You can use coarse-grained locking where one lock protects everything (the GIL approach).

There are various pros and cons of each method. Fine-grained locking allows greater parallelism – two threads can
execute in parallel when they don’t share any resources. However there is a much larger administrative overhead. For
every line of code, you may need to acquire and release several locks.

The coarse grained approach is the opposite. Two threads can’t run at the same time, but an individual thread will run faster because its not doing so much bookkeeping. Ultimately it comes down to a tradeoff between single-threaded speed and parallelism.

There have been a few attempts to remove the GIL in python, but the extra overhead for single threaded machines was generally too large. Some cases can actually be slower even on multi-processor machines
due to lock contention.

Do other languages that are compiled to bytecode employ a similar mechanism?

It varies, and it probably shouldn’t be considered a language property so much as an implementation property.
For instance, there are Python implementations such as Jython and IronPython which use the threading approach of their underlying VM, rather than a GIL approach. Additionally, the next version of Ruby looks to be moving towards introducing a GIL.

The following is from the official Python/C API Reference Manual:

The Python interpreter is not fully
thread safe. In order to support
multi-threaded Python programs,
there’s a global lock that must be
held by the current thread before it
can safely access Python objects.
Without the lock, even the simplest
operations could cause problems in a
multi-threaded program: for example,
when two threads simultaneously
increment the reference count of the
same object, the reference count could
end up being incremented only once
instead of twice.

Therefore, the rule exists that only
the thread that has acquired the
global interpreter lock may operate on
Python objects or call Python/C API
functions. In order to support
multi-threaded Python programs, the
interpreter regularly releases and
reacquires the lock — by default,
every 100 bytecode instructions (this
can be changed with
sys.setcheckinterval()). The lock is
also released and reacquired around
potentially blocking I/O operations
like reading or writing a file, so
that other threads can run while the
thread that requests the I/O is
waiting for the I/O operation to

I think it sums up the issue pretty well.

The global interpreter lock is a big mutex-type lock that protects reference counters from getting hosed. If you are writing pure python code, this all happens behind the scenes, but if you embedding Python into C, then you might have to explicitly take/release the lock.

This mechanism is not related to Python being compiled to bytecode. It’s not needed for Java. In fact, it’s not even needed for Jython (python compiled to jvm).

see also this question

Python, like perl 5, was not designed from the ground up to be thread safe. Threads were grafted on after the fact, so the global interpreter lock is used to maintain mutual exclusion to where only one thread is executing code at a given time in the bowels of the interpreter.

Individual Python threads are cooperatively multitasked by the interpreter itself by cycling the lock every so often.

Grabbing the lock yourself is needed when you are talking to Python from C when other Python threads are active to ‘opt in’ to this protocol and make sure that nothing unsafe happens behind your back.

Other systems that have a single-threaded heritage that later evolved into mulithreaded systems often have some mechanism of this sort. For instance, the Linux kernel has the “Big Kernel Lock” from its early SMP days. Gradually over time as multi-threading performance becomes an issue there is a tendency to try to break these sorts of locks up into smaller pieces or replace them with lock-free algorithms and data structures where possible to maximize throughput.

Regarding your second question, not all scripting languages use this, but it only makes them less powerful. For instance, the threads in Ruby are green and not native.

In Python, the threads are native and the GIL only prevents them from running on different cores.

In Perl, the threads are even worse. They just copy the whole interpreter, and are far from being as usable as in Python.

Maybe this article by the BDFL will help.