We present a method for monitoring nocturnal giraffe behavior that reduces several hours of thermal camera surveillance footage to a short video summary suitable for expert review. We formulate video summarization as a tracking problem: frames in which giraffes are successfully tracked are presumed to show typical poses and behaviors and are excluded from the summary, whereas frames containing track initializations or terminations are presumed to show atypical events and are therefore included. To implement this tracking-by-detection approach, we evaluate combinations of image features to determine which performs best for long-wave infrared (thermal) cameras, and we devise a variant of the deformable parts model object detector that uses geodesic distances to handle the extreme variation among typical giraffe postures. Finally, we evaluate summarization performance in terms of recall and compressibility, and show that the choice between more fragile and more robust tracking techniques trades one of these measures off against the other.
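The frame-selection rule described above can be illustrated with a minimal sketch. It assumes the tracker's output is available as one set of active track IDs per frame; the function names, the `margin` parameter, and the exact definitions of recall and compressibility used here are illustrative assumptions, not the definitions from the evaluation itself.

```python
from typing import List, Set


def summarize(tracks_per_frame: List[Set[int]], margin: int = 0) -> List[int]:
    """Return sorted indices of frames where a track starts or ends.

    A track "starts" at the first frame where its ID appears and "ends"
    at the first frame where its ID is no longer present. `margin`
    (hypothetical parameter) keeps that many neighboring frames on each
    side of an event for context.
    """
    keep: Set[int] = set()
    prev: Set[int] = set()
    last = len(tracks_per_frame) - 1
    for i, current in enumerate(tracks_per_frame):
        started = current - prev   # IDs new in this frame
        ended = prev - current     # IDs that vanished in this frame
        if started or ended:
            keep.update(range(max(0, i - margin), min(last, i + margin) + 1))
        prev = current
    return sorted(keep)


def recall(summary: List[int], event_frames: List[int]) -> float:
    """Fraction of annotated event frames that appear in the summary (assumed definition)."""
    events = set(event_frames)
    if not events:
        return 1.0
    return len(events & set(summary)) / len(events)


def compressibility(summary: List[int], total_frames: int) -> float:
    """Fraction of frames dropped from the original video (assumed definition)."""
    return 1.0 - len(summary) / total_frames


# Toy example: track 1 is lost at frame 3, track 2 appears at frame 5.
tracks = [{1}, {1}, {1}, set(), set(), {2}, {2}, {2}]
summary = summarize(tracks)
print(summary)                                # [0, 3, 5]
print(recall(summary, [3, 5]))                # 1.0
print(compressibility(summary, len(tracks)))  # 0.625
```

Under these assumptions, a tracker that loses its target more often produces more initialization and termination events, so more frames are kept: the summary grows longer, but fewer atypical events are likely to be missed.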