Matching Items (4)

Comparison of MIMD and SIMT Parallel Iterative Solvers for Laplace's Equation

Description

A comparison of the performance of CUDA versus OpenMP for the Jacobi, Gauss-Seidel, and S.O.R. iterative methods for Laplace's Equation with Dirichlet boundary conditions is presented. Both the number of cores and the grid size were varied for the OpenMP program, while only the grid size was varied for the CUDA program. CUDA outperforms the 8-core OpenMP program with the Jacobi and Gauss-Seidel schemes for all grid sizes, and is competitive with S.O.R. for all grid sizes examined.
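For readers unfamiliar with the three schemes named in this abstract, the following is a minimal serial sketch in Python. It is not the thesis code, which used CUDA and OpenMP. The function name `solve_laplace`, the boundary values (top edge held at 1.0, the other edges at 0.0), the tolerance, and the relaxation factor `omega=1.5` are all assumptions chosen for illustration.

```python
def solve_laplace(n, method="jacobi", omega=1.5, tol=1e-8, max_iter=100_000):
    """Iterate on an n x n grid until the largest update falls below tol.

    Hypothetical setup: Dirichlet boundary with the top edge at 1.0 and the
    other three edges at 0.0. Returns (grid, iterations_used).
    """
    if method not in ("jacobi", "gauss-seidel", "sor"):
        raise ValueError(f"unknown method: {method}")
    u = [[0.0] * n for _ in range(n)]
    u[0] = [1.0] * n  # top boundary row
    # Gauss-Seidel is S.O.R. with a relaxation factor of exactly 1.
    w = omega if method == "sor" else 1.0
    for it in range(1, max_iter + 1):
        # Jacobi reads only values from the previous sweep, so it needs a copy.
        # Gauss-Seidel and S.O.R. read the grid in place and therefore reuse
        # values already updated during the current sweep.
        src = [row[:] for row in u] if method == "jacobi" else u
        change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                avg = 0.25 * (src[i - 1][j] + src[i + 1][j]
                              + src[i][j - 1] + src[i][j + 1])
                new = u[i][j] + w * (avg - u[i][j])
                change = max(change, abs(new - u[i][j]))
                u[i][j] = new
        if change < tol:
            return u, it
    return u, max_iter


if __name__ == "__main__":
    for name in ("jacobi", "gauss-seidel", "sor"):
        _, iterations = solve_laplace(17, name)
        print(name, iterations)
```

In the Jacobi sweep every point depends only on the previous sweep, so all points can be updated independently, which is what makes it a natural fit for parallel hardware. The in-place Gauss-Seidel and S.O.R. sweeps carry dependencies between neighbouring points; the abstract does not say how the thesis ordered those updates in parallel.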

Date Created
  • 2013-05

Study of interface adhesive properties of wurtzite materials for carbon fiber composites

Description

Recently, the use of zinc oxide (ZnO) nanowires as an interphase in composite materials has been demonstrated to increase the interfacial shear strength between carbon fiber and an epoxy matrix. In this research work, the strong adhesion between ZnO and carbon fiber is investigated to elucidate the interactions at the interface that result in high interfacial strength. First, molecular dynamics (MD) simulations are performed to calculate the adhesive energy between bare carbon and ZnO. Since the carbon fiber surface has oxygen functional groups, these groups were modeled, and the MD simulations showed that ketones interact strongly with ZnO; this was not observed for hydroxyls or carboxylic acids. It was also found that the ketone molecules' ability to change orientation facilitated the interactions with the ZnO surface. Experimentally, the atomic force microscope (AFM) was used to measure the adhesive energy between ZnO and carbon through a liftoff test employing a highly oriented pyrolytic graphite (HOPG) substrate and a ZnO-covered AFM tip. Oxygen functionalization of the HOPG surface increased the adhesive energy. Additionally, the surface of ZnO was modified to hold a negative charge, which also increased the adhesive energy. This increase in adhesion resulted from increased induction forces, given the relatively high polarizability of HOPG and the preservation of the charge on the ZnO surface. It was found that the additional negative charge can be preserved on the ZnO surface because carbon and ZnO form a Schottky contact, which creates an energy barrier. Other materials with the same ionic properties as ZnO but with higher polarizability also demonstrated good adhesion to carbon. This result substantiates that the induced interaction can be facilitated not only by the polarizability of carbon but by that of any of the materials at the interface.
The ability to modify the magnitude of the induced interaction between carbon and an ionic material provides a new route to create interfaces with controlled interfacial strength.

Date Created
  • 2013

Performance optimization of linux networking for latency-sensitive virtual systems

Description

Virtual machines and containers have steadily improved their performance over time as a result of innovations in their architecture and software ecosystems. Network functions and workloads are increasingly migrating to virtual environments, supported by developments in software defined networking (SDN) and network function virtualization (NFV). Previous performance analyses of virtual systems in this context often ignore significant performance gains that can be achieved with practical modifications to hypervisor and host systems. In this thesis, the network performance of containers and virtual machines is measured with standard network performance tools. The performance of these systems utilizing a standard 3.18.20 Linux kernel is compared to that of a realtime-tuned variant of the same kernel. This thesis motivates improving determinism in virtual systems with modifications to host and guest kernels and thoughtful process isolation. With the system modifications described, the median TCP bandwidth of KVM virtual machines over bridged network interfaces is increased by 10.8%, with a corresponding reduction in standard deviation of 87.6%. Docker containers see an 8.8% improvement in median bandwidth and a 4.4% reduction in standard deviation of TCP measurements using similar bridged networking. System tuning also reduces the standard deviation of TCP request/response latency (TCP RR) over bridged interfaces by 86.8% for virtual machines and 97.9% for containers. Hardware devices assigned to virtual systems also see reductions in variance, although the reductions are less pronounced.
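The percentages in this abstract are relative changes in the median and the standard deviation between a baseline run and a tuned run. The sketch below shows that arithmetic in Python. The helper names `percent_change` and `summarize` are hypothetical, the use of the sample standard deviation is an assumption, and the bandwidth samples are made up for illustration; none of them are measurements from the thesis.

```python
from statistics import median, stdev


def percent_change(before, after):
    """Relative change from before to after, in percent (negative = reduction)."""
    return 100.0 * (after - before) / before


def summarize(baseline, tuned):
    """Compare the median and sample standard deviation of two sets of samples."""
    return {
        "median_change_pct": percent_change(median(baseline), median(tuned)),
        "stdev_change_pct": percent_change(stdev(baseline), stdev(tuned)),
    }


# Made-up bandwidth samples in Mbit/s, NOT data from the thesis.
baseline = [900.0, 1000.0, 1100.0, 800.0, 1200.0]
tuned = [1090.0, 1100.0, 1110.0, 1080.0, 1120.0]

result = summarize(baseline, tuned)
print(result)
```

Reporting the standard deviation alongside the median is what lets the abstract separate a modest throughput gain from a much larger gain in determinism.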

Date Created
  • 2015

Improving CGRA utilization by enabling multi-threading for power-efficient embedded systems

Description

Performance improvements have largely followed Moore's Law thanks to technology scaling. In order to continue improving performance, power-efficiency must be improved. Better technology has improved power-efficiency, but this has a limit. Multi-core architectures have been shown to be an additional aid in the pursuit of greater power-efficiency. Accelerators are growing in popularity as the next means of achieving power-efficient performance. Accelerators such as Intel SSE are ideal, but prove difficult to program. FPGAs, on the other hand, are less efficient due to their fine-grained reconfigurability. A middle ground is found in CGRAs, which are highly power-efficient yet largely programmable accelerators. Power-efficiencies of 100s of GOPs/W have been estimated, more than 2 orders of magnitude greater than current processors. Currently, CGRAs are limited in their applicability because they can accelerate only a single thread at a time. This limitation becomes especially apparent as multi-core/multi-threaded processors have moved into the mainstream. This limitation is removed by enabling multi-threading on CGRAs through a software-oriented approach. The key capability in this solution is enabling quick run-time transformation of schedules to execute on targeted portions of the CGRA. This allows the CGRA to be shared among multiple threads simultaneously. Analysis shows that enabling multi-threading has very small costs but provides very large benefits (less than 1% single-threaded performance loss but nearly 300% CGRA throughput increase). By increasing the dynamism of CGRA scheduling, the overall performance of an optimized system is shown to increase by almost 350% over that of a single-threaded CGRA, and to be nearly 20x that of the same system with no CGRA in a highly threaded environment.

Date Created
  • 2011