Scheduling theory is a common tool to analyze the performance of parallel and distributed computing systems, such as their load balance. How to distribute the input data to be able to execute a set of tasks in a minimum amount of time can be modeled as a scheduling problem. Often these models assume that the computation time required for each task is known accurately. However in many practical case, only approximate values are available at the time of scheduling. This thesis research investigates how replicating the data required by the tasks can help coping with the inaccuracies of the processing times. In particular, it investigates the problem of scheduling independent tasks to optimize the makespan on a parallel system where the processing times of tasks are only known up to a multiplicative factor. The problem is decomposed in two phases: a first offline phase where the data of the tasks are placed and a second online phase where the tasks are actually scheduled. For this problem, this thesis investigates three different strategies, each allowing a different degree of replication of jobs: a) No Replication b) Replication everywhere and c) Replication in groups, and proposes approximation algorithms and theoretical lower bound on achievable approximation ratios. This allows us to study the tradeoff between the number of replication and the guarantee on the makespan. Replication improves performance but incurs a cost in terms of memory consumption. The objective is then to develop scheduling algorithm with good competitive ratio to minimize both the makespan of the schedule and the memory consumption of the machines.