Object copy
From Wikipedia, the free encyclopedia
Copying data is one of the most common operations in computer programs. An object is a composite data type in object-oriented programming languages. Object copying is thus the action of copying the attributes of one object to another object of the same data type. An object may be copied in order to reuse all or part of its data in a new context.
Deep vs. Shallow vs. Lazy copy
The design goal of most objects is to give the semblance of being made out of one monolithic block, even though most are not. Because objects are made up of several different parts, copying becomes nontrivial. Several strategies exist to attack this problem.
Consider two objects, A and B, which refer to the memory blocks x_i and y_i respectively. Think of A and B as strings and of x_i and y_i as the characters they contain.
The following paragraphs explain different strategies for copying A into B.
Shallow copy
One of them is the shallow copy. In this process B is attached to the same memory block as A.
This results in a situation in which some data is shared between A and B, so modifying one will alter the other. The original memory block of B is no longer referred to from anywhere. If the language does not have automatic garbage collection, the original memory block of B has probably been leaked.
The advantage of shallow copies is that they are fast and that their cost does not depend on the size of the data.
Bitwise copies of objects which are not made up of a monolithic block are shallow copies.
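The aliasing effect described above can be sketched in C++ as follows. This is only an illustration: the `Text` struct and the `shallow_copy` and `demo_shallow_copy` functions are hypothetical names introduced here, and the memory blocks live on the stack so that the sketch itself leaks nothing.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical string-like object that merely refers to a character block.
struct Text {
    char* data;
};

// Shallow copy: b is attached to the same memory block as a.
// Nothing is duplicated, so the cost does not depend on the data size.
void shallow_copy(const Text& a, Text& b) {
    b.data = a.data;
}

// Walks through the sharing that results from a shallow copy.
void demo_shallow_copy() {
    char block_a[] = "abc";
    char block_b[] = "xyz";
    Text a{block_a};
    Text b{block_b};

    shallow_copy(a, b);
    assert(a.data == b.data);                 // both refer to block_a now

    b.data[0] = 'Z';                          // modifying through b ...
    assert(std::strcmp(a.data, "Zbc") == 0);  // ... alters a as well

    // block_b is unchanged but no longer referred to by b. Had it been
    // allocated with new[] in a language without garbage collection,
    // it would now be leaked.
    assert(std::strcmp(block_b, "xyz") == 0);
}
```

Calling `demo_shallow_copy()` runs through the scenario; all three assertions hold.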
Deep copy
An alternative is a deep copy. Here the data is actually copied over.
The result differs from that of a shallow copy: B receives its own copy of the data. The advantage is that A and B do not depend on each other, but this comes at the cost of a slower, more expensive copy.
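A deep copy might look like the following in C++. This is a minimal sketch, not a reference implementation: the `Text` class is a hypothetical string-like type that owns its character block, and both its copy constructor and its copy assignment operator duplicate that block.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical string-like class that owns its character block.
class Text {
public:
    explicit Text(const char* s)
        : size_(std::strlen(s)), data_(new char[size_ + 1]) {
        std::memcpy(data_, s, size_ + 1);
    }

    // Deep copy: allocate a new block and copy the characters over.
    Text(const Text& other)
        : size_(other.size_), data_(new char[other.size_ + 1]) {
        std::memcpy(data_, other.data_, size_ + 1);
    }

    // Deep copy on assignment: build the new block first, then release
    // the old one, so the old block is not leaked.
    Text& operator=(const Text& other) {
        if (this != &other) {
            char* fresh = new char[other.size_ + 1];
            std::memcpy(fresh, other.data_, other.size_ + 1);
            delete[] data_;
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }

    ~Text() { delete[] data_; }

    char* data() { return data_; }
    const char* data() const { return data_; }

private:
    std::size_t size_;  // declared before data_ so it is initialized first
    char* data_;
};
```

After `b = a`, writing through `b.data()` leaves `a` untouched, because each object has its own block. The cost of the copy grows with the size of the data.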
Lazy copy
A lazy copy is a combination of both strategies above. When initially copying an object, a (fast) shallow copy is used. A counter is also used to track how many objects share the data. When the program wants to modify an object, it can determine if the data is shared (by examining the counter) and can do a deep copy if necessary.
From the outside, a lazy copy looks just like a deep copy, but it takes advantage of the speed of a shallow copy whenever possible. The downside is a rather high but constant base cost because of the counter. In certain situations, circular references can also cause problems.
Lazy copy is related to copy-on-write.
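The counter-based scheme can be sketched in C++ by letting `std::shared_ptr` supply the counter. The `LazyText` class and its member names are hypothetical, and the sketch assumes single-threaded use, since `use_count()` is only a reliable sharing test when no other thread is copying the object at the same time.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical lazily copied string. The reference count kept by
// std::shared_ptr plays the role of the counter.
class LazyText {
public:
    explicit LazyText(std::string s)
        : data_(std::make_shared<std::string>(std::move(s))) {}

    // The implicitly generated copy operations are the fast shallow copy:
    // only the pointer is copied and the shared counter is incremented.

    const std::string& read() const { return *data_; }

    // Every modification detaches first, so a write never affects
    // another object that still shares the data.
    void set_char(std::size_t i, char c) {
        detach();
        (*data_)[i] = c;
    }

    bool shares_data_with(const LazyText& other) const {
        return data_ == other.data_;
    }

private:
    // Performs the deferred deep copy, but only if the data is shared.
    void detach() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
    }

    std::shared_ptr<std::string> data_;
};
```

After `LazyText b = a;` the two objects share one string. The first call to `b.set_char(...)` triggers the deep copy; later writes to `b` find the counter at one and modify the data in place.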
How to copy objects
Nearly all object-oriented programming languages provide some way to copy objects. As most objects are not provided by the language itself, the programmer has to define how an object should be copied, just as they have to define whether two objects are identical or even comparable in the first place. Many languages provide some default behavior.
How copying is handled varies from language to language and depends on each language's concept of an object. The following presents examples for two of the most widely used object-oriented languages, C++ and Java, which should cover nearly every way in which an object-oriented language can attack this problem.
Copying in C++
In C++, user-defined objects are meant to behave just like built-in ones. This implies that it must be possible to construct an object based on the model of another object. A special language construct, the copy constructor, is provided to handle this.

    class Vector {
    public:
        Vector() : x(0), y(0), z(0) {}
        // Copy constructor: initializes each member from the other object.
        Vector(const Vector& other) : x(other.x), y(other.y), z(other.z) {}
    private:
        int x, y, z;
    };

    Vector a;
    Vector b(a);  // b is a copy of a
Note that if the programmer does not provide a copy constructor, the compiler generates a default one that copies each member in turn; for pointer members this amounts to a shallow copy. In certain situations it is also useful to disallow copying altogether.
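One way to disallow copying, assuming a C++11 or later compiler, is to declare the copy operations as deleted. The `Connection` class below is a hypothetical example of a type that stands for a unique resource. Before C++11, a common idiom was to declare the copy constructor and copy assignment operator private and leave them undefined.

```cpp
#include <type_traits>

// Hypothetical class representing a resource that must not be duplicated.
class Connection {
public:
    Connection() = default;

    // Deleted copy operations: any attempt to copy a Connection is
    // rejected at compile time.
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
};

// Compile-time checks that the type really is non-copyable.
static_assert(!std::is_copy_constructible<Connection>::value,
              "Connection must not be copy-constructible");
static_assert(!std::is_copy_assignable<Connection>::value,
              "Connection must not be copy-assignable");
```

With these declarations, code such as `Connection b(a);` or `b = a;` fails to compile instead of silently producing a shallow copy.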
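Another example of deep and shallow copying in C++

    // Deep and shallow copy example
    #include <iostream>

    class base {
    public:
        int i;
        base() : i(0) {}
        base(int j) : i(j) {}
    };

    int main() {
        base* p1 = new base(23);
        base* p2;

        // Shallow copy: only the pointer is copied, so p1 and p2
        // refer to the same object.
        p2 = p1;
        std::cout << "\naddress of P1:" << p1;
        std::cout << "\nvalue at p1:" << p1->i;
        std::cout << "\naddress of P2:" << p2;
        std::cout << "\nvalue at p2:" << p2->i;

        // Deleting through p2 destroys the object that p1 refers to as well.
        // Both pointers now dangle; dereferencing either one is undefined
        // behavior, so they are reset instead of being read.
        delete p2;
        p1 = p2 = nullptr;

        // Deep copy: assignment copies the object itself, so o2 gets its
        // own i that is independent of o1.
        base o1(67);
        base o2;
        o2 = o1;
        std::cout << "\nvalue in i:" << o1.i;
        std::cout << "\nvalue in i after copy:" << o2.i << std::endl;
        return 0;
    }

Output

    address of P1:0x00323C88
    value at p1:23
    address of P2:0x00323C88
    value at p2:23
    value in i:67
    value in i after copy:67

The addresses shown vary between runs and platforms. An earlier version of this example also read p2->i after the delete and printed -572662307, a garbage value: the object had already been destroyed, and reading it is undefined behavior.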
Copying in Java
Unlike in C++, objects in Java are always accessed indirectly through references. Objects are never created implicitly but are instead always passed or assigned by reference. The virtual machine's garbage collector reclaims objects once they are no longer reachable. There is no automatic way to copy any given object in Java.
Copying is usually performed by a clone() method of a class. This method usually, in turn, calls the clone() method of its parent class to obtain a copy, and then does any custom copying procedures. Eventually this gets to the clone() method of Object (the uppermost class), which creates a new instance of the same class as the object and copies all the fields to the new instance (a "shallow copy"). If this method is used, the class must implement the Cloneable marker interface, or else Object's clone() will throw a CloneNotSupportedException. After obtaining a copy from the parent class, a class's own clone() method may then provide custom cloning capability, like deep copying (i.e. duplicating some of the structures referred to by the object) or giving the new instance a new unique ID.
One disadvantage is that the return type of Object's clone() is Object, so the result needs to be explicitly cast back into the appropriate type unless the class overrides clone() with a more specific return type, which has been possible since Java 5 (technically a custom clone() method could return another type of object, but that is generally inadvisable). One advantage of using clone() is that since it is an overridable method, calling clone() on an object that exposes it will use the clone() method of its actual class, without the calling code needing to know what that class is (which would be necessary with a copy constructor).
Another disadvantage is that one often cannot access the clone() method on an abstract type. Most interfaces and abstract classes in Java do not specify a public clone() method. As a result, often the only way to use the clone() method is to know the actual class of an object, which is contrary to the abstraction principle of using the most generic type possible. For example, if one has a List reference in Java, one cannot invoke clone() on that reference because List specifies no public clone() method. Actual implementations of List such as ArrayList and LinkedList generally have clone() methods themselves, but it is inconvenient and bad abstraction to carry around the actual class type of an object.
Another way to copy objects in Java is to serialize them through the Serializable interface. This is typically used for persistence and wire protocol purposes, but it does create copies of objects and, unlike clone(), readily provides a deep copy that gracefully handles cyclic graphs of objects with minimal effort from the programmer.
Both of these methods suffer from a notable problem: the constructor is not used for objects copied with clone() or serialization. This can lead to bugs with improperly initialized data, prevents the use of final member fields, and makes maintenance challenging.