Record Movies with Java Media Framework (JMF)
By Gal Ratner
While writing my own instant messenger with JMF (Java Media Framework),
I had to work through a number of obstacles. The hardest part of
recording a movie from a webcam was putting the right pieces together
in the correct order. The experience gave me a much better command of
JMF. If you are learning or exploring JMF, this tutorial should improve
your working knowledge of the framework.
To begin, JMF is Sun's API for processing audio, video, and other
time-based media. It is an optional package and can be downloaded from
java.sun.com. See the "Elsewhere on the Web" sidebar for a link. In
this tutorial, we are going to learn how to use JMF to record movies
captured by a webcam connected to your computer.
Streaming media, or time-based media, is a term used to describe any
flow of data that must be received and processed in real time. It is
basically a steady stream of information that has to be processed on
the fly in order to either present it to the user or record it into a
file. Processing operations can include converting the data into a
different format, compressing or decompressing it, or merging it with
other streams from other sources.
The quality of the movie or song you are streaming is a function of
several factors, including bandwidth, the processing efficiency of your
system, and the compression format it was transmitted in. For
high-quality movies, we need more processing power and bandwidth.
Quality is determined by, among other things, the number of frames
displayed per second (the frame rate).
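To get a feel for why bandwidth and compression matter, the sketch
below estimates the data rate of uncompressed video. The frame size,
color depth, and frame rate are assumed example values, not figures
from the example program.

```java
// Rough estimate of the data rate of uncompressed video.
final class RawVideoRate {
    /** Bytes per second for uncompressed frames of the given size. */
    static long bytesPerSecond(int width, int height, int bytesPerPixel, int framesPerSecond) {
        return (long) width * height * bytesPerPixel * framesPerSecond;
    }

    public static void main(String[] args) {
        // Assumed example values: 320x240 pixels, 24-bit color, 15 frames per second.
        long rate = bytesPerSecond(320, 240, 3, 15);
        System.out.println(rate + " bytes/s");                // 3456000 bytes/s
        System.out.println(rate * 8 / 1000000.0 + " Mbit/s"); // 27.648 Mbit/s
    }
}
```

Even at this modest size the raw stream is far too large for most
connections, which is why the choice of compression format has such a
large effect on quality.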
The most common media formats are CINEPAK, which is used in AVI and
QuickTime files, and MPEG-1, which is used in MPEG files. The most
common protocol for streaming media is the Real-time Transport Protocol
(RTP). RTP can be used over your local network or the Internet, with
unicast or multicast IP addresses. Over unicast, a separate copy of the
media is sent to each consumer. Over multicast, the source transmits
the data only once, and the network duplicates it and delivers it to
each consumer. RTP does not guarantee that packets arrive, or that they
arrive in order. It is the receiver's responsibility to pick them up
and put them back in order before presenting the media, and some
packets may be lost along the way. We can monitor the progress of the
data using RTCP (Real-Time Transport Control Protocol), which also
provides an identification mechanism for RTP.
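The receiver's job of putting packets back in order can be sketched
with a sorted map keyed by sequence number. This is a toy model, not
JMF code: the class name ReorderBuffer and the string payloads are
made up for illustration, and it ignores the fact that real RTP
sequence numbers are 16-bit values that wrap around.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy receiver buffer: sorts packets by sequence number and reports gaps.
final class ReorderBuffer {
    private final TreeMap<Integer, String> packets = new TreeMap<>();

    /** Store a packet under its sequence number, in whatever order it arrives. */
    void receive(int sequenceNumber, String payload) {
        packets.put(sequenceNumber, payload);
    }

    /** Payloads sorted by sequence number. */
    List<String> inOrder() {
        return new ArrayList<>(packets.values());
    }

    /** Sequence numbers between the first and last received packet that never arrived. */
    List<Integer> missing() {
        List<Integer> lost = new ArrayList<>();
        if (packets.isEmpty()) {
            return lost;
        }
        for (int seq = packets.firstKey(); seq < packets.lastKey(); seq++) {
            if (!packets.containsKey(seq)) {
                lost.add(seq);
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        ReorderBuffer buffer = new ReorderBuffer();
        buffer.receive(3, "C");
        buffer.receive(1, "A");
        buffer.receive(5, "E");
        buffer.receive(2, "B");
        System.out.println(buffer.inOrder()); // [A, B, C, E]
        System.out.println(buffer.missing()); // [4]
    }
}
```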
There are two types of data sources: pull data sources and push data
sources. A pull data source can be a file or a web page. With a pull
data source, the client must initiate the data transfer and control the
flow of data from the source. With a push data source, the source
initiates the transfer and controls the flow. Any source that
broadcasts media using RTP is a push data source, and so is your
microphone or webcam.
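The difference between the two styles comes down to who drives the
transfer. The sketch below models that with plain Java. PullSource,
PushSource, and ChunkHandler are hypothetical stand-ins invented for
this illustration; they are not the JMF data source classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for the two transfer styles; these are not JMF types.
final class PushPullDemo {
    /** Pull: the client asks for the next chunk whenever it is ready. */
    static final class PullSource {
        private final Iterator<String> chunks;
        PullSource(List<String> data) { chunks = data.iterator(); }
        String read() { return chunks.hasNext() ? chunks.next() : null; } // null marks the end
    }

    /** Push: the source decides when data is delivered and calls the client back. */
    interface ChunkHandler { void onChunk(String chunk); }

    static final class PushSource {
        private final List<String> data;
        PushSource(List<String> data) { this.data = data; }
        void start(ChunkHandler handler) {
            for (String chunk : data) {
                handler.onChunk(chunk); // the source drives the transfer
            }
        }
    }

    /** Client-driven loop: keep reading until the source runs dry. */
    static List<String> pullAll(PullSource source) {
        List<String> received = new ArrayList<>();
        for (String c = source.read(); c != null; c = source.read()) {
            received.add(c);
        }
        return received;
    }

    /** Source-driven delivery: the client only registers a callback. */
    static List<String> pushAll(PushSource source) {
        List<String> received = new ArrayList<>();
        source.start(chunk -> received.add(chunk));
        return received;
    }

    public static void main(String[] args) {
        List<String> frames = Arrays.asList("f1", "f2", "f3");
        System.out.println(pullAll(new PullSource(frames))); // [f1, f2, f3]
        System.out.println(pushAll(new PushSource(frames))); // [f1, f2, f3]
    }
}
```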
Controls provide us with a way of monitoring and controlling the
progress of media being downloaded. JMF has standard controls built in,
and we can also define our own custom controls.
A player processes an input stream from a data source and renders it in
real time. A player is what we are going to put between the webcam and
the screen. Since a player is responsible for processing and delivering
a constant flow of data, it has several states. Broadly speaking, it is
either stopped or started. The steps that lead to a player being
started are:
Realizing a Player: Realizing a player involves telling the player
everything it needs to know about the data source it's going to play.
When a player is realized, it knows what resources it needs in order to
play the media. It also has visual components used to render the media
to the screen.
Pre-Fetch: Once a player is realized, we can call its prefetch() method
in order to make it prepare to present the media. The player then
preloads the data, obtains the resources it needs, and does whatever
else it needs to prepare itself to play. When a player is done
pre-fetching, it moves into the pre-fetched state.
Start: When a player is started, it moves into the started state. Its
media time is mapped to its time-base time and its clock starts
running.
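The order of these steps can be pictured as a small state machine. The
sketch below is a simplified model in plain Java, not the JMF classes
themselves: PlayerLifecycle is a made-up name, the transitions happen
instantly, and the intermediate states the real framework passes
through (such as realizing and pre-fetching) are left out.

```java
// Simplified model of the player life cycle; not the JMF classes themselves.
final class PlayerLifecycle {
    enum State { UNREALIZED, REALIZED, PREFETCHED, STARTED }

    private State state = State.UNREALIZED;

    State state() { return state; }

    void realize()  { require(State.UNREALIZED); state = State.REALIZED; }
    void prefetch() { require(State.REALIZED);   state = State.PREFETCHED; }
    void start()    { require(State.PREFETCHED); state = State.STARTED; }

    // In this model every step insists on the state that precedes it.
    private void require(State expected) {
        if (state != expected) {
            throw new IllegalStateException("expected " + expected + " but was " + state);
        }
    }

    public static void main(String[] args) {
        PlayerLifecycle player = new PlayerLifecycle();
        player.realize();
        player.prefetch();
        player.start();
        System.out.println(player.state()); // STARTED
    }
}
```

The model is stricter than a real player needs to be; its only purpose
is to make the ordering of the steps explicit.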
A processor is a type of player. In addition to rendering media data, a
processor can output media through another data source to be used by
another player or processor. A processor takes a data source's input,
performs processing on the media, and then outputs the data. It can
send the data to another device or to a data source. It can parse media
streams, apply special effects, encode or decode, combine multiple
tracks of data (for example, video and audio), and deliver the data to
a screen or speakers.
A processor has several states that can be split into two main groups:
unrealized and realized. Before a processor is realized, it needs to be
configured. While configuring, it connects to the data source and
gathers all the information it needs in order to process the data. Once
configured, it can be realized and then moves on to pre-fetching the
media. At this point, we can start processing using the processor.
While the processor is in the configured state, we can decide on our
processing options using its track control objects.
*NOTE: If we call realize on an unrealized processor, it automatically
moves through the configuring and configured states, and we lose the
opportunity to use its track controls.
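The note above can be illustrated with a small model. This is plain
Java written for this explanation, not the JMF Processor class:
ProcessorModel and its setVideoEncoding method are hypothetical, and
the model simply refuses track options outside the configured state in
order to make the point visible.

```java
// Simplified model of a processor's early states; not the JMF Processor class.
final class ProcessorModel {
    enum State { UNREALIZED, CONFIGURED, REALIZED }

    private State state = State.UNREALIZED;
    private String videoEncoding = "default";

    State state() { return state; }
    String videoEncoding() { return videoEncoding; }

    void configure() {
        if (state != State.UNREALIZED) {
            throw new IllegalStateException("already configured");
        }
        state = State.CONFIGURED;
    }

    /** In this model, track options may only be chosen while configured. */
    void setVideoEncoding(String encoding) {
        if (state != State.CONFIGURED) {
            throw new IllegalStateException("track options unavailable in state " + state);
        }
        videoEncoding = encoding;
    }

    /** Realizing an unrealized processor passes straight through the configured state. */
    void realize() {
        if (state == State.UNREALIZED) {
            configure();
        }
        state = State.REALIZED;
    }

    public static void main(String[] args) {
        ProcessorModel careful = new ProcessorModel();
        careful.configure();
        careful.setVideoEncoding("cinepak"); // chosen while configured
        careful.realize();
        System.out.println(careful.videoEncoding()); // cinepak

        ProcessorModel hasty = new ProcessorModel();
        hasty.realize(); // skips past the configured state
        System.out.println(hasty.videoEncoding()); // default
    }
}
```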
A microphone can capture audio and a webcam can capture video;
therefore, they are both data sources (push data sources). Transmitting
from those sources is done using a data sink.
Data sinks read media from a data source and transmit it to other
locations such as a file, a network, or the Internet using RTP.
Next, we'll look at a JMF example and walk through the source code.
So far, we have covered some introductory concepts. Don't worry if you
don't understand everything we've covered yet. It will all make sense
as we put it together.
First, we must access our webcam to see if we can recognize it using
our JMF program.
VideoFormat vidformat =
Vector devices =
Now that we've seen that we have a device installed that actually can
transmit video, we are going to create a data source from this device.
CaptureDeviceInfo di =
MediaLocator ml =
DataSource mainCamSource =
Since we want several players to access this data source, we're going
to have to turn it into a data source we can clone. We can then use the
clones to perform any presentation or transmission we would like to.
mainCamSource = Manager.createCloneableDataSource(mainCamSource);
Now that we have made our data source cloneable, we need to start
processing it in order for the clones to work. We are going to use a
class distributed by Sun to help us control the media events on our
player. The class is called camStateHelper and it implements the
ControllerListener interface. It is available as part of the source
code for this exercise, and it is just a convenience class that helps
us step through the player's states.
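One way to picture why the original source has to be running before
the clones can do anything is a simple fan-out model. The sketch below
is plain Java written for this explanation, not JMF code:
FanOutSource and FrameListener are made-up names, and dropping frames
until the original is started is an assumption of the model.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a cloneable source: every clone sees each frame the original delivers.
final class FanOutSource {
    interface FrameListener { void onFrame(String frame); }

    private final List<FrameListener> clones = new ArrayList<>();
    private boolean started = false;

    /** Register a clone; it receives the frames delivered from now on. */
    void addClone(FrameListener clone) { clones.add(clone); }

    void start() { started = true; }

    /** Called for each captured frame; nothing flows until the original is started. */
    void deliver(String frame) {
        if (!started) {
            return;
        }
        for (FrameListener clone : clones) {
            clone.onFrame(frame);
        }
    }

    public static void main(String[] args) {
        FanOutSource camera = new FanOutSource();
        List<String> preview = new ArrayList<>();
        List<String> recorder = new ArrayList<>();
        camera.addClone(frame -> preview.add(frame));
        camera.addClone(frame -> recorder.add(frame));

        camera.deliver("frame-0"); // dropped: the original is not started yet
        camera.start();
        camera.deliver("frame-1");
        camera.deliver("frame-2");

        System.out.println(preview);  // [frame-1, frame-2]
        System.out.println(recorder); // [frame-1, frame-2]
    }
}
```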
Next, we'll configure, realize, and then start our player. Once we have
done that, we need to use a clone in order to get its visual component,
which we'll use to display the movie in real-time on our screen.
camStateHelper playhelper = new camStateHelper(processor);
Importantly, our processor's content descriptor must be set to null so
that the processor renders the media itself, like a player, instead of
outputting data. When we are recording data into a file, we can set it
to the type we need; at this point, however, we do not need to output
our data into another data source. Now that we have our clone up and
running, we are going to use its visual component in order to see a
preview of our movie on the screen.
At this point, we are going to create a button with an action event
that is going to invoke a third processor. This processor is going to
use a media locator and save our movie to a file. First, we are going
to create a media locator for our selected file:
URL movieUrl = file.toURL();
MediaLocator dest = new MediaLocator(movieUrl);
Second, we clone another data source and create our processor on it.
We are going to configure our processor; however, before we realize it,
we are going to get its track controls and set its video format to our
desired format, CINEPAK. We are going to set its content descriptor to
the QuickTime file type (FileTypeDescriptor.QUICKTIME), and we'll set
its frame rate to 15.
DataSource recordCamSource = dataSource.cloneCamSource();
Processor recordProcessor = Manager.createProcessor(recordCamSource);
camStateHelper playhelper = new camStateHelper(recordProcessor);
VideoFormat vfmt = new VideoFormat(VideoFormat.CINEPAK);
Control control =
if ( control != null &&
control instanceof javax.media.control.FrameRateControl )
Once we have done this, we can realize the processor, create a data
sink into our destination, open it, start the processor, and start the
data sink. Our webcam is now running. Our processor is processing the
data that is being pushed out of it, and using a media locator, our
data sink is transmitting the data into our file.
DataSink dataSink = Manager.createDataSink
In order to stop recording, we need to close our processor, and then stop and close our data sink.
Now that we have stopped recording, we can navigate to our file, open
it, and view the movie we have just created. Even though we have
stopped our third processor, it was only working on a clone. The other
two processors are still up and running. In order to stop the whole
process, we have to shut down our original data source.
Fine Tuning the Processor
You can gain additional control over various qualities of your
processor by manipulating its track controls. The number of controls
available depends on the number of tracks in the overall media stream.
Once the processor has been configured, you can find out how many track
control objects are available to you by calling the getTrackControls()
method. You can then set each track control individually for format,
quality, frame rate, and so on. Different types of controls can be
found in the package javax.media.control. You can also enable and
disable controls as you see fit.
Setting Video Quality
When a processor is configured, you can set its video quality as
follows. Get the processor's controls and look through them to find the
quality control (QualityControl). Once you have found the control, make
sure that the owner of the control is a codec. Get all the formats of
this control, find the one that matches your video format, and use the
setQuality method to set the quality for that format.
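The "look through the controls" step is a plain search through an
array for an object of the right type. The sketch below shows that
search in isolation. It does not use JMF: ControlLookup, find, and the
QualitySetting interface are hypothetical stand-ins, and the owner and
format checks described above are left out.

```java
// Searching an array of controls for one of a given type; stand-in types only.
final class ControlLookup {
    /** Stand-in for a quality control; hypothetical, not the JMF interface. */
    interface QualitySetting { float setQuality(float quality); }

    /** Return the first control of the requested type, or null if there is none. */
    static <T> T find(Object[] controls, Class<T> type) {
        for (Object control : controls) {
            if (type.isInstance(control)) {
                return type.cast(control);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        float[] applied = new float[1];
        QualitySetting quality = value -> { applied[0] = value; return value; };
        Object[] controls = { "some other control", quality };

        QualitySetting found = find(controls, QualitySetting.class);
        if (found != null) {
            found.setQuality(0.5f); // only set the quality when the control exists
        }
        System.out.println(applied[0]); // 0.5
    }
}
```

Returning null when nothing matches mirrors the way the article's own
code guards its control with a null check before using it.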
Checking for Encoding
On a configured processor, you can get the track controls. Look through
the controls until you find the video format control. You can then use
a variety of methods available to you in the format class to read the
frame rate, size, or maximum data length, for example. You can also
generate a new format out of the current one. Once you have a new
format, you can try to set your track to the new format.
Of course you can always use the popular instant messengers for
chatting and viewing video, but imagine what you can do now that you
know how to record movies from a webcam. Remember, if you can record
your own webcam using JMF, you can also record any RTP session from any
webcam on the Internet, including those popular instant messengers. To
continue your exploration of JMF, your next step might be to learn how
to create special effects in your movies.
About the example
This code was written using JDK 5.0, JMF 2.1.1e, and NetBeans. To
compile and run it, you must use a 5.0 JDK and have jmf.jar in your
classpath. Compile and run the main class: jmfexample.jmfexample.java.